A recent podcast interview with EY has gone somewhat viral, and in it he claims that researchers have dismissed his views without seriously engaging with his arguments, which are described here in some detail.
I'm aware of ongoing AI safety and interpretability research, but the term "AI safety" is used to mean both something close to AI ethics and something close to preventing an existential threat to humanity. This dual use makes it difficult, as a layperson, to distinguish the goals of, say, Anthropic, and the extent to which they consider the latter a serious concern.
I haven't personally found EY's arguments to be particularly rigorous, but I'm not the best suited person to evaluate their validity. Any thoughts are appreciated. Thanks in advance!
submitted by /u/SchmidhuberDidIt
"Hotter take: ML would have advanced faster if another front-end language had been available and widely adopted instead of Python. One that is interactive yet fast & compilable, multithreaded (no GIL), isn't bloated, doesn't care about white spaces,... E.g. Julia or some Lisp."
Link from the original tweet
submitted by /u/Marcapiel
Over the last 10 years, a number of players have developed autonomous vehicle (AV) systems using deep neural networks (DNNs). These systems have evolved from simple rule-based systems to Advanced Driver Assistance Systems (ADAS) and fully autonomous vehicles. These systems require petabytes of data and thousands of compute units (vCPUs and GPUs) to train. This […]
https://www.legoscript.com/we-will-die-if-not-careful
submitted by /u/pyactee
The do-it-yourself climate modeling movement is here. Researchers from Northwestern University and Argonne National Laboratory have been launching NVIDIA Jetson-driven edge computing Waggle devices across the globe to collect hyper-local climate information. Waggle is an open source sensor platform for edge computing developed by Argonne. Working with this, scientists share open-source AI code designed for […]
A million developers across the globe are now using the NVIDIA Jetson platform for edge AI and robotics to build innovative technologies. Plus, more than 6,000 companies — a third of which are startups — have integrated the platform with their products. These milestones and more will be celebrated during the NVIDIA Jetson Edge AI […]
To drive the automotive industry forward, NVIDIA and Mercedes-Benz are taking the virtual road. NVIDIA founder and CEO Jensen Huang joined Mercedes-Benz CEO Ola Källenius on stage at the automaker’s strategy update event yesterday in Silicon Valley, showcasing progress in their landmark partnership to digitalize the entire product lifecycle, plus the ownership and automated driving […]
The cloud just got bigger. NVIDIA and Microsoft announced this week they’re working to bring top PC Xbox Game Studios games to the GeForce NOW library, including titles from Bethesda, Mojang Studios and Activision, pending closure of Microsoft’s acquisition. With six new games joining the cloud this week for members to stream, it’s a jam-packed […]
This post is co-written with Swagata Ashwani, Senior Data Scientist at Boomi. Boomi is an enterprise-level software as a service (SaaS) independent software vendor (ISV) that creates developer enablement tooling for software engineers. These tools integrate via API into Boomi’s core service offering. In this post, we discuss how Boomi used the bring-your-own-container (BYOC) approach […]
"Deep learning is the only thing that currently works at scale it's the only class of algorithms that is able to discover arbitrary functions in a reasonable amount of time."
https://www.youtube.com/watch?v=p-OYPRhqRCg
I know of the universal approximation theorem. But is there any mathematical formulation of this statement?
submitted by /u/GraciousReformer
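For reference, the theorem the poster mentions can be stated concretely; note that it is purely existential, so it does not formalize the "reasonable amount of time" part of the quoted claim:

```latex
% Classical universal approximation theorem (Cybenko 1989; Hornik 1991),
% one hidden layer; existence only, no bound on N or on training time.
\textbf{Theorem.} Let $\sigma$ be continuous and non-polynomial. For every
compact $K \subset \mathbb{R}^d$, every continuous $f : K \to \mathbb{R}$,
and every $\varepsilon > 0$, there exist $N \in \mathbb{N}$ and parameters
$a_i, b_i \in \mathbb{R}$, $w_i \in \mathbb{R}^d$ such that
\[
  \sup_{x \in K}\, \Bigl| f(x) - \sum_{i=1}^{N} a_i\, \sigma(w_i^\top x + b_i) \Bigr| < \varepsilon .
\]
```

No comparable theorem is known for the efficiency-at-scale claim in the quote; that part remains an empirical observation.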
Laptops equipped with NVIDIA GeForce RTX 4070, 4060 and 4050 GPUs are now available. The new lineup — including NVIDIA Studio-validated laptops from ASUS, GIGABYTE and Samsung — gives creators more options to create from anywhere with lighter, thinner devices that dramatically exceed the performance of the last generation.
Similar to a product explainer video, like here: https://www.youtube.com/playlist?list=PL2P1Z-F3mmqxsMlpCp6wpeqAqlusiuZ_h
I've tried different services, but either the result was not good enough (e.g. Steve.ai has a "script to animation" feature, but the result was very limited) or the service did not cover script-to-video (e.g. https://www.synthesia.io/).
submitted by /u/muran123456
I have a lot of photos on my portfolio website and usually post them on social media in series, like this example, but I want to find new and creative ways to combine and curate photos that are visually appealing. To come up with ideas outside of my own head, I thought there might be a tool that can help.
submitted by /u/Northlandscapes
After you build, train, and evaluate your machine learning (ML) model to ensure it solves the intended business problem, you want to deploy that model to enable decision-making in business operations. Models that support business-critical functions are deployed to a production environment where a model release strategy is put in place. Given the nature […]
We’re thrilled to announce an expanded collaboration between AWS and Hugging Face to accelerate the training, fine-tuning, and deployment of large language and vision models used to create generative AI applications. Generative AI applications can perform a variety of tasks, including text summarization, answering questions, code generation, image creation, and writing essays and articles. AWS […]
Data Passivity and the Current Obsession with Off-the-Shelf Chatbots. Last September, Bill Schmarzo (“Point – Counterpoint on Why Organizations Suck at AI”) listed a few common excuses enterprises use to explain why they aren’t doing more with AI: We Don’t Have the Right Talent. “We can’t hire the right talent and don’t have bottomless budgets…”
The post DSC Weekly 21 February 2023 – Data Passivity and the Current Obsession with Off-The-Shelf Chatbots appeared first on Data Science Central.
With every passing year, data analytics services are gaining more prominence as most enterprises realize the potential of data in driving important business decisions. The growing availability of data, developments in technology, and mounting demand for data-driven insights will contribute to this trend. Additionally, the upsurge of big data and cloud computing will make it easier…
The post The Impact of AI-enabled Data Analytics Services Across Major Industries appeared first on Data Science Central.
Cybercriminals still attack startup businesses even though they may have smaller databases and less information to steal compared to the big players in the market. Why? Bad actors take the path of least resistance, and startups tend to be less equipped to defend against cyber attacks, spending an average of $500 or less on cybersecurity…
The post How to Build a Robust Cybersecurity Strategy for Your Startup appeared first on Data Science Central.
The telecommunications industry has for decades helped advance revolutionary change – enabling everything from telephones and television to online streaming and self-driving cars. Yet the industry has long been considered an evolutionary mover in its own business. A recent survey of more than 400 telecommunications industry professionals from around the world found that same cautious […]
Structural information of phylogenetic tree topologies plays an important
role in phylogenetic inference. However, finding appropriate topological
structures for specific phylogenetic inference tasks often requires significant
design effort and domain expertise. In this paper, we propose a novel
structural representation method for phylogenetic inference based on learnable
topological features. By combining the raw node features that minimize the
Dirichlet energy with modern graph representation learning techniques, our
learnable topological features can provide efficient structural information of
phylogenetic trees that automatically adapts to different downstream tasks
without requiring domain expertise. We demonstrate the effectiveness and
efficiency of our method on a simulated data tree probability estimation task
and a benchmark of challenging real data variational Bayesian phylogenetic
inference problems.
We study Stochastic Gradient Descent with AdaGrad stepsizes: a popular
adaptive (self-tuning) method for first-order stochastic optimization. Despite
being well studied, existing analyses of this method suffer from various
shortcomings: they either assume some knowledge of the problem parameters,
impose strong global Lipschitz conditions, or fail to give bounds that hold
with high probability. We provide a comprehensive analysis of this basic method
without any of these limitations, in both the convex and non-convex (smooth)
cases, that additionally supports a general "affine variance" noise model and
provides sharp rates of convergence in both the low-noise and
high-noise regimes.
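The stepsize rule the abstract analyzes can be sketched in a few lines; this is the generic (scalar) AdaGrad update on a toy deterministic objective, not the paper's noise model or its high-probability analysis:

```python
import math

def adagrad(grad, x0, eta=1.0, eps=1e-8, steps=2000):
    """SGD with AdaGrad stepsizes: each step divides by the square root of
    the accumulated squared gradients (scalar case, for clarity)."""
    x, G = x0, 0.0
    for _ in range(steps):
        g = grad(x)
        G += g * g
        x -= eta * g / (math.sqrt(G) + eps)
    return x

# Toy problem f(x) = x^2 (gradient 2x): AdaGrad needs no hand-tuned
# stepsize schedule, since the accumulated gradients self-tune it.
x_star = adagrad(lambda x: 2.0 * x, x0=5.0)
```

The self-tuning property is exactly why analyses of this method are delicate: the stepsize depends on the observed gradients, so it is not independent of the noise.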
In this paper, we investigate the impact of stochasticity and large stepsizes
on the implicit regularisation of gradient descent (GD) and stochastic gradient
descent (SGD) over diagonal linear networks. We prove the convergence of GD and
SGD with macroscopic stepsizes in an overparametrised regression setting and
characterise their solutions through an implicit regularisation problem. Our
crisp characterisation leads to qualitative insights about the impact of
stochasticity and stepsizes on the recovered solution. Specifically, we show
that large stepsizes consistently benefit SGD for sparse regression problems,
while they can hinder the recovery of sparse solutions for GD. These effects
are magnified for stepsizes in a tight window just below the divergence
threshold, in the "edge of stability" regime. Our findings are supported by
experimental results.
We develop inductive biases for the machine learning of complex physical
systems based on the port-Hamiltonian formalism. To satisfy by construction the
principles of thermodynamics in the learned physics (conservation of energy,
non-negative entropy production), we modify accordingly the port-Hamiltonian
formalism so as to achieve a port-metriplectic one. We show that the
constructed networks are able to learn the physics of complex systems by parts,
thus alleviating the burden associated with the experimental characterization
and subsequent learning process for such systems. Predictions can nevertheless
be made at the scale of the complete system. Examples are shown of the
performance of the proposed technique.
Federated learning (FL) is a privacy-preserving learning technique that
enables distributed computing devices to train shared learning models across
data silos collaboratively. Existing FL works mostly focus on designing
advanced FL algorithms to improve the model performance. However, the economic
considerations of the clients, such as fairness and incentive, are yet to be
fully explored. Without such considerations, self-motivated clients may lose
interest and leave the federation. To address this problem, we designed a novel
incentive mechanism that involves a client selection process to remove
low-quality clients and a money transfer process to ensure a fair reward
distribution. Our experimental results strongly demonstrate that the proposed
incentive mechanism can effectively improve the duration and fairness of the
federation.
Hello everyone,
It's that time again, thank you all so much for the support you've given us over here. I've done a ton of typing this morning, so for a summary of what I've updated, you can see the higher-level twitter thread I wrote at https://twitter.com/hi_tysam/status/1627679672988319746?cxt=HHwWhIC-yb2C15YtAAAA, or the more detailed (but still rough cut) patch notes I wrote this morning at https://github.com/tysam-code/hlb-CIFAR10/releases/tag/v0.5.0
Happy to answer any questions anyone might have, cheers! :D :))))
submitted by /u/tysam_and_co
In November 2022, we announced that AWS customers can generate images from text with Stable Diffusion models in Amazon SageMaker JumpStart. Stable Diffusion is a deep learning model that allows you to generate realistic, high-quality images and stunning art in just a few seconds. Although creating impressive images can find use in industries ranging from […]
https://avaturn.me/
submitted by /u/theaiguru
I have written a blog post explaining the Barlow Twins paper from Meta AI. Can you guys have a read and provide suggestions to improve it further? Thanks in advance!
https://pmgautam.com/posts/barlow-twins-explanation.html
submitted by /u/pmgautam_
I am following this implementation of DDPG and found this code:
self.linear3.weight.data.uniform_(-init_w, init_w)
It seems like the author is initializing the weights of the final layer from a uniform distribution.
Why is the author only replacing the final layer weights?
How does uniform weight initialization help?
I have heard a lot about the usefulness of orthogonal initialization, but this is the first time I have seen this type of initialization.
submitted by /u/Academic-Rent7800
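For context, here is a minimal sketch of the pattern in question (architecture and dimensions are hypothetical, chosen for illustration). The usual motivation, as in the original DDPG paper where init_w is 3e-3, is that a small uniform range on the last layer makes the network's initial outputs near zero, so early Q-value targets and policy outputs are not dominated by a large random bias; the hidden layers keep PyTorch's default fan-in-based initialization:

```python
import torch
import torch.nn as nn

class Critic(nn.Module):
    """Minimal DDPG-style critic (hypothetical sizes, for illustration):
    hidden layers use PyTorch's default init; only the output layer is
    re-initialized from a small uniform range."""
    def __init__(self, state_dim, action_dim, hidden=64, init_w=3e-3):
        super().__init__()
        self.linear1 = nn.Linear(state_dim + action_dim, hidden)
        self.linear2 = nn.Linear(hidden, hidden)
        self.linear3 = nn.Linear(hidden, 1)
        # Final layer only: U(-init_w, init_w) makes initial Q-estimates
        # close to zero instead of inheriting the default fan-in scale.
        self.linear3.weight.data.uniform_(-init_w, init_w)
        self.linear3.bias.data.uniform_(-init_w, init_w)

    def forward(self, state, action):
        x = torch.relu(self.linear1(torch.cat([state, action], dim=-1)))
        x = torch.relu(self.linear2(x))
        return self.linear3(x)

critic = Critic(state_dim=3, action_dim=1)
q = critic(torch.zeros(2, 3), torch.zeros(2, 1))  # batch of 2 states/actions
```

Orthogonal initialization addresses a different concern (conditioning of the hidden layers), so the two are not in conflict.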
This note describes a new approach to classifying graphs that leverages graph
generative models (GGM). Assuming a GGM that defines a joint probability
distribution over graphs and their class labels, I derive classification
formulas for the probability of a class label given a graph. A new conditional
ELBO can be used to train a generative graph auto-encoder model for
discrimination. While leveraging generative models for classification has been
well explored for non-relational i.i.d. data, to my knowledge this is a novel
approach to graph classification.
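The classification formulas referred to follow the generic generative-classifier pattern, i.e. Bayes' rule applied to the joint model (this is the standard form, not the note's specific derivation):

```latex
% Classify a graph G by conditioning the GGM's joint distribution on G.
\[
  p(y \mid G) \;=\; \frac{p(G, y)}{\sum_{y'} p(G, y')}
  \;=\; \frac{p(G \mid y)\, p(y)}{\sum_{y'} p(G \mid y')\, p(y')},
  \qquad
  \hat{y}(G) \;=\; \arg\max_{y}\; p(G, y).
\]
```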
This work explains in detail the theory behind Complex-Valued Neural Network
(CVNN), including Wirtinger calculus, complex backpropagation, and basic
modules such as complex layers, complex activation functions, or complex weight
initialization. We also show the impact of not adapting the weight
initialization correctly to the complex domain. This work presents a strong
focus on the implementation of such modules in Python using the cvnn toolbox.
We also perform simulations on real-valued data, cast to the complex domain by
means of the Hilbert transform, and verify the potential interest of CVNN
even for non-complex data.
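A minimal NumPy sketch of the kind of module the abstract lists: a complex dense layer with the split ("CReLU"-style) activation, and complex-Gaussian weights scaled by fan-in as a stand-in for the complex-aware initializations discussed (illustrative only, not the cvnn toolbox API):

```python
import numpy as np

def crelu(z):
    """Split activation: apply ReLU to real and imaginary parts separately,
    one of the basic complex activation functions the paper covers."""
    return np.maximum(z.real, 0.0) + 1j * np.maximum(z.imag, 0.0)

rng = np.random.default_rng(0)
n_in, n_out = 4, 3

# Complex dense layer: complex-Gaussian weights scaled by fan-in
# (a stand-in for complex-aware initialization, not the paper's scheme).
W = (rng.standard_normal((n_out, n_in))
     + 1j * rng.standard_normal((n_out, n_in))) / np.sqrt(2 * n_in)
b = np.zeros(n_out, dtype=complex)

z = rng.standard_normal(n_in) + 1j * rng.standard_normal(n_in)
out = crelu(W @ z + b)  # complex output with non-negative real/imag parts
```

The abstract's point about initialization is visible here: a real-valued scheme applied naively would ignore that variance is shared between the real and imaginary parts.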
Lung cancer is the leading cause of death among different types of cancers.
Every year, the lives lost due to lung cancer exceed those lost to pancreatic,
breast, and prostate cancer combined. The survival rate for lung cancer
patients is very low compared to that of other cancer patients due to late
diagnosis. Early lung cancer diagnosis is therefore crucial for patients to
receive timely treatment, increasing the survival rate or even the chance of
becoming cancer-free. This paper proposes a deep-learning model for early lung
cancer prediction and diagnosis from Computed Tomography (CT) scans. The
proposed model achieves high accuracy. In addition, it can be a beneficial tool
to support radiologists' decisions in predicting and detecting lung cancer and
its stage.
Graph neural networks (GNNs) are able to leverage the structure of graph data
by passing messages along the edges of the graph. While this allows GNNs to
learn features depending on the graph structure, for certain graph topologies
it leads to inefficient information propagation and a problem known as
oversquashing. This has recently been linked with the curvature and spectral
gap of the graph. On the other hand, adding edges to the message-passing graph
can lead to increasingly similar node representations and a problem known as
oversmoothing. We propose a computationally efficient algorithm that prevents
oversquashing by systematically adding edges to the graph based on spectral
expansion. We combine this with a relational architecture, which lets the GNN
preserve the original graph structure and provably prevents oversmoothing. We
find experimentally that our algorithm outperforms existing graph rewiring
methods in several graph classification tasks.
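The spectral gap that the abstract links to oversquashing is the second-smallest eigenvalue of the graph Laplacian, and the effect of adding edges on it can be checked directly (a toy NumPy illustration, not the authors' rewiring algorithm):

```python
import numpy as np

def spectral_gap(adj):
    """Second-smallest eigenvalue of the (unnormalized) graph Laplacian
    L = D - A; a larger gap means a better-connected graph, which is
    associated with less oversquashing."""
    deg = np.diag(adj.sum(axis=1))
    lam = np.linalg.eigvalsh(deg - adj)  # eigenvalues in ascending order
    return lam[1]                        # lam[0] is ~0 for a connected graph

# 4-cycle: Laplacian eigenvalues are {0, 2, 2, 4}, so the gap is 2.
cycle4 = np.array([[0, 1, 0, 1],
                   [1, 0, 1, 0],
                   [0, 1, 0, 1],
                   [1, 0, 1, 0]], dtype=float)
gap = spectral_gap(cycle4)

# Adding the two diagonal chords (making the complete graph K4) raises
# the gap to 4 -- the kind of expansion edge-addition methods exploit.
k4 = np.ones((4, 4)) - np.eye(4)
gap_k4 = spectral_gap(k4)
```

The abstract's tension is visible in this toy: K4 has the larger gap, but its nodes also all see identical neighborhoods, which is the oversmoothing side of the trade-off.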
In this work, we propose a zero-shot voice conversion method using speech
representations trained with self-supervised learning. First, we develop a
multi-task model to decompose a speech utterance into features such as
linguistic content, speaker characteristics, and speaking style. To disentangle
content and speaker representations, we propose a training strategy based on
Siamese networks that encourages similarity between the content representations
of the original and pitch-shifted audio. Next, we develop a synthesis model
with pitch and duration predictors that can effectively reconstruct the speech
signal from its decomposed representation. Our framework allows controllable
and speaker-adaptive synthesis to perform zero-shot any-to-any voice conversion
achieving state-of-the-art results on metrics evaluating speaker similarity,
intelligibility, and naturalness. Using just 10 seconds of data for a target
speaker, our framework can perform voice swapping and achieves a speaker
verification EER of 5.5% for seen speakers and 8.4% for unseen speakers.
The increasing application of Artificial Intelligence and Machine Learning
models poses potential risks of unfair behavior and, in light of recent
regulations, has attracted the attention of the research community. Several
researchers focused on seeking new fairness definitions or developing
approaches to identify biased predictions. However, none try to exploit the
counterfactual space to this aim. In that direction, the methodology proposed
in this work aims to unveil unfair model behaviors using counterfactual
reasoning in the case of fairness under unawareness setting. A counterfactual
version of equal opportunity named counterfactual fair opportunity is defined
and two novel metrics that analyze the sensitive information of counterfactual
samples are introduced. Experimental results on three different datasets show
the efficacy of our methodologies and our metrics, disclosing the unfair
behavior of classic machine learning and debiasing models.
Spherical harmonics provide a smooth, orthogonal, and symmetry-adapted basis
to expand functions on a sphere, and they are used routinely in computer
graphics, signal processing and different fields of science, from geology to
quantum chemistry. More recently, spherical harmonics have become a key
component of rotationally equivariant models for geometric deep learning, where
they are used in combination with distance-dependent functions to describe the
distribution of neighbors within local spherical environments within a point
cloud. We present a fast and elegant algorithm for the evaluation of the
real-valued spherical harmonics. Our construction integrates many of the
desirable features of existing schemes and allows one to compute Cartesian
derivatives in a numerically stable and computationally efficient manner. We
provide an efficient C implementation of the proposed algorithm, along with
easy-to-use Python bindings.
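For a concrete sense of the objects being evaluated, here are the l = 1 real spherical harmonics in their Cartesian form, checked against the addition-theorem identity; this is a hand-rolled toy in NumPy, not the paper's algorithm or its C implementation:

```python
import numpy as np

def real_sph_harm_l1(xyz):
    """Real spherical harmonics for l = 1 in Cartesian form on a unit
    vector: Y_{1,-1} ~ y, Y_{1,0} ~ z, Y_{1,1} ~ x, each scaled by
    sqrt(3 / (4*pi))."""
    x, y, z = xyz
    c = np.sqrt(3.0 / (4.0 * np.pi))
    return np.array([c * y, c * z, c * x])

# The sum of squares over one l-shell is constant on the unit sphere:
# sum_m Y_{l,m}(n)^2 = (2l + 1) / (4*pi)  (addition theorem at n = n').
n_hat = np.array([0.6, 0.8, 0.0])    # a unit vector
vals = real_sph_harm_l1(n_hat)
total = np.sum(vals ** 2)            # should equal 3 / (4*pi)
```

The Cartesian form is what makes the derivatives mentioned in the abstract simple: for l = 1 they are constants, and recursive schemes extend this to higher l.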
We present Trieste, an open-source Python package for Bayesian optimization
and active learning benefiting from the scalability and efficiency of
TensorFlow. Our library enables the plug-and-play of popular TensorFlow-based
models within sequential decision-making loops, e.g. Gaussian processes from
GPflow or GPflux, or neural networks from Keras. This modular mindset is
central to the package and extends to our acquisition functions and the
internal dynamics of the decision-making loop, both of which can be tailored
and extended by researchers or engineers when tackling custom use cases.
Trieste is a research-friendly and production-ready toolkit backed by a
comprehensive test suite, extensive documentation, and available at
https://github.com/secondmind-labs/trieste.
In this work we developed a deep learning technique that successfully solves
a non-linear dynamic control problem. Instead of directly tackling the control
problem, we combined methods in probabilistic neural networks and a
Kalman-Filter-inspired model to build a non-linear state estimator for the
system. We then used the estimated states to implement a trivial controller for
the now fully observable system. We applied this technique to a crucial
non-linear control problem that arises in the operation of the LIGO system, an
interferometric gravitational-wave observatory. We demonstrated in simulation
that our approach can learn from data to estimate the state of the system,
allowing successful control of the interferometer's mirror. We also
developed a computationally efficient model that can run in real time at high
sampling rate on a single modern CPU core, one of the key requirements for the
implementation of our solution in the LIGO digital control system. We believe
these techniques could be used to help tackle similar non-linear control
problems in other applications.
Robotics, automation, and related Artificial Intelligence (AI) systems have
become pervasive bringing in concerns related to security, safety, accuracy,
and trust. With growing dependency on physical robots that work in close
proximity to humans, the security of these systems is becoming increasingly
important to prevent cyber-attacks that could lead to privacy invasion,
critical operations sabotage, and bodily harm. The current shortfall of
professionals who can defend such systems demands the development and
integration of a dedicated curriculum. This course description includes details
about seven self-contained and adaptive modules on "AI security threats against
pervasive robotic systems". Topics include: 1) Introduction, examples of
attacks, and motivation; 2) Robotic AI attack surfaces and penetration testing;
3) Attack patterns and security strategies for input sensors; 4) Training
attacks and associated security strategies; 5) Inference attacks and associated
security strategies; 6) Actuator attacks and associated security strategies;
and 7) Ethics of AI, robotics, and cybersecurity.
Decentralised Machine Learning (DML) enables collaborative machine learning
without centralised input data. Federated Learning (FL) and Edge Inference are
examples of DML. While tools for DML (especially FL) are starting to flourish,
many are not flexible and portable enough to experiment with novel systems
(e.g., RISC-V), non-fully connected topologies, and asynchronous collaboration
schemes. We overcome these limitations via a domain-specific language that
allows mapping DML schemes to an underlying middleware, i.e., the \ff parallel
programming library. We experiment with it by generating different working DML
schemes on two emerging architectures (ARM-v8, RISC-V) and the x86-64 platform.
We characterise the performance and energy efficiency of the presented schemes
and systems. As a byproduct, we introduce a RISC-V porting of the PyTorch
framework, the first publicly available to our knowledge.
This paper considers the use of recently proposed optimal transport-based
multivariate test statistics, namely rank energy and its variant the soft rank
energy derived from entropically regularized optimal transport, for the
unsupervised nonparametric change point detection (CPD) problem. We show that
the soft rank energy enjoys both fast rates of statistical convergence and
robust continuity properties which lead to strong performance on real datasets.
Our theoretical analyses remove the need for resampling and out-of-sample
extensions previously required to obtain such rates. In contrast the rank
energy suffers from the curse of dimensionality in statistical estimation and
moreover can signal a change point from arbitrarily small perturbations, which
leads to a high rate of false alarms in CPD. Additionally, under mild
regularity conditions, we quantify the discrepancy between soft rank energy and
rank energy in terms of the regularization parameter. Finally, we show our
approach performs favorably in numerical experiments compared to several other
optimal transport-based methods as well as maximum mean discrepancy.
We consider the problem of testing the identity of a reversible Markov chain
against a reference from a single trajectory of observations. Employing the
recently introduced notion of a lumping-congruent Markov embedding, we show
that, at least in a mildly restricted setting, testing identity to a reversible
chain reduces to testing to a symmetric chain over a larger state space and
recover state-of-the-art sample complexity for the problem.
Many novel notions of "risk" (e.g., CVaR, tilted risk, DRO risk) have been
proposed and studied, but these risks are all at least as sensitive as the mean
to loss tails on the upside, and tend to ignore deviations on the downside. We
study a complementary new risk class that penalizes loss deviations in a
bi-directional manner, while having more flexibility in terms of tail
sensitivity than is offered by mean-variance. This class lets us derive
high-probability learning guarantees without explicit gradient clipping, and
empirical tests using both simulated and real data illustrate a high degree of
control over key properties of the test loss distribution incurred by
gradient-based learners.
Utility-Based Shortfall Risk (UBSR) is a risk metric that is increasingly
popular in financial applications, owing to certain desirable properties that
it enjoys. We consider the problem of estimating UBSR in a recursive setting,
where samples from the underlying loss distribution are available
one-at-a-time. We cast the UBSR estimation problem as a root finding problem,
and propose stochastic approximation-based estimation schemes. We derive
non-asymptotic bounds on the estimation error in terms of the number of samples. We also
consider the problem of UBSR optimization within a parameterized class of
random variables. We propose a stochastic gradient descent based algorithm for
UBSR optimization, and derive non-asymptotic bounds on its convergence.
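A minimal sketch of the recursive root-finding idea, assuming UBSR is characterized as the root t of E[loss(X - t)] = lambda (the loss function and the 1/k step sizes here are illustrative, not the paper's scheme):

```python
import numpy as np

def ubsr_estimate(sample_stream, loss_fn, lam, n_iter=5000):
    # Robbins-Monro stochastic approximation for E[loss_fn(X - t)] = lam,
    # consuming one sample from the loss distribution at a time.
    t = 0.0
    for k in range(1, n_iter + 1):
        x = next(sample_stream)
        step = 1.0 / k
        t += step * (loss_fn(x - t) - lam)  # move t toward the root
    return t

rng = np.random.default_rng(0)
stream = iter(rng.standard_normal(5000))
# Sanity check: with the identity loss, the root of E[X - t] = 0 is t = E[X] = 0.
t_hat = ubsr_estimate(stream, lambda u: u, lam=0.0)
```

With the identity loss the recursion reduces to a running mean, which makes the convergence behavior easy to verify.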
( 2 min )
This paper considers the use of recently proposed optimal transport-based
multivariate test statistics, namely rank energy and its variant the soft rank
energy derived from entropically regularized optimal transport, for the
unsupervised nonparametric change point detection (CPD) problem. We show that
the soft rank energy enjoys both fast rates of statistical convergence and
robust continuity properties which lead to strong performance on real datasets.
Our theoretical analyses remove the need for resampling and out-of-sample
extensions previously required to obtain such rates. In contrast the rank
energy suffers from the curse of dimensionality in statistical estimation and
moreover can signal a change point from arbitrarily small perturbations, which
leads to a high rate of false alarms in CPD. Additionally, under mild
regularity conditions, we quantify the discrepancy between soft rank energy and
rank energy in terms of the regularization parameter. Finally, we show our
approach performs favorably in numerical experiments compared to several other
optimal transport-based methods as well as maximum mean discrepancy.
( 2 min )
I have constructed a novel ML (NLP) dataset for classification and labeled it with three classes. The dataset is rather small with about 700 examples, out of which the classes have about 400, 200, and 100 examples respectively. I would like to publish it and describe it in an official publication for a workshop or a conference.
When looking at related datasets and publications, I see that it is common for authors to publish the dataset already split into three chunks - train, dev, and test (see the images). It is also common in these papers to report the performance of baseline models on the dataset. Considering the dataset's small size, I feel like 5-fold cross-validation would be a good alternative for such a small dataset, rather than doing something like a split into 450-1…
( 46 min )
Do you think AI will be able to give trustable advice in the future?
Doing research for a school project. If you have the time, I would appreciate it if you could fill out this form.
https://forms.gle/X7Fg8cQsqWb278bm7
View Poll
submitted by /u/Jakets_V
The more significant ChatGPT usage is becoming, the more concerns the tool is raising.
What do you think: is it an incredible source of inspiration or the death of art as we know it?
Would you be able to distinguish between AI-generated text and human poetry?
Take part in the experiment and share your thoughts here: ChatGPT Survey.
submitted by /u/Lonely-Wish-6377
North American reinforcement materials market is anticipated to display revenue growth at a CAGR of 5.64% by 2028. Get free sample report
North America Reinforcement Materials Market
submitted by /u/shreyaslakhare11
The Middle East and Africa reinforcement materials market is projected to witness growth at a CAGR of 5.13% by 2028. Get a free sample report
Middle East and Africa Reinforcement Materials Market
submitted by /u/shreyaslakhare11
Europe’s reinforcement materials market is likely to register growth at a CAGR of 5.87% based on revenue during the period 2021-2028. Get free sample report
Europe Reinforcement Materials Market
submitted by /u/shreyaslakhare11
The Asia-Pacific reinforcement materials market is assessed to display growth at a CAGR of 6.33% over the forecast years 2021-2028. Get a free sample report
Asia-Pacific Reinforcement Materials Market
submitted by /u/shreyaslakhare11
The Global Reinforcement Materials Market is estimated to grow at a CAGR of 6.02%, and is likely to garner $12826 million by 2028. Get a Free Sample Report
Reinforcement Materials Market
submitted by /u/shreyaslakhare11
We consider the optimal sample complexity theory of tabular reinforcement
learning (RL) for controlling the infinite horizon discounted reward in a
Markov decision process (MDP). Optimal min-max complexity results have been
developed for tabular RL in this setting, leading to a sample complexity
dependence on $\gamma$ and $\epsilon$ of the form $\tilde
\Theta((1-\gamma)^{-3}\epsilon^{-2})$, where $\gamma$ is the discount factor
and $\epsilon$ is the solution error tolerance. However, in many applications
of interest, the optimal policy (or all policies) will induce mixing. We show
that in these settings the optimal min-max complexity is $\tilde
\Theta(t_{\text{minorize}}(1-\gamma)^{-2}\epsilon^{-2})$, where
$t_{\text{minorize}}$ is a measure of mixing that is within an equivalent
factor of the total variation mixing time. Our analysis is based on
regeneration-type ideas that, we believe, are of independent interest, since
they can be used to study related problems for general state space MDPs.
( 2 min )
Variational inequalities are a broad and flexible class of problems that
includes minimization, saddle point, fixed point problems as special cases.
Therefore, variational inequalities are used in a variety of applications
ranging from equilibrium search to adversarial learning. Today's realities with
the increasing size of data and models demand parallel and distributed
computing for real-world machine learning problems, most of which can be
represented as variational inequalities. Meanwhile, most distributed approaches
have a significant bottleneck: the cost of communication. The three main
techniques to reduce both the total number of communication rounds and the cost
of one such round are the use of similarity of local functions, compression of
transmitted information and local updates. In this paper, we combine all these
approaches. Such a triple synergy did not previously exist for variational
inequalities and saddle point problems, nor even for minimization problems. The
methods presented in this paper have the best theoretical guarantees of
communication complexity and are significantly ahead of other methods for
distributed variational inequalities. The theoretical results are confirmed by
adversarial learning experiments on synthetic and real datasets.
( 2 min )
We prove that various stochastic gradient descent methods, including the
stochastic gradient descent (SGD), stochastic heavy-ball (SHB), and stochastic
Nesterov's accelerated gradient (SNAG) methods, almost surely avoid any strict
saddle manifold. To the best of our knowledge, this is the first time such
results are obtained for SHB and SNAG methods. Moreover, our analysis expands
upon previous studies on SGD by removing the need for bounded gradients of the
objective function and uniformly bounded noise. Instead, we introduce a more
practical local boundedness assumption for the noisy gradient, which is
naturally satisfied in empirical risk minimization problems typically seen in
training of neural networks.
( 2 min )
Mitigating the discrimination of machine learning models has gained
increasing attention in medical image analysis. However, few works focus on
fair treatment of patients with multiple sensitive demographic attributes, which
is a crucial yet challenging problem for real-world clinical applications. In this
paper, we propose a novel method for fair representation learning with respect
to multi-sensitive attributes. We pursue the independence between target and
multi-sensitive representations by achieving orthogonality in the
representation space. Concretely, we enforce the column space orthogonality by
keeping target information on the complement of a low-rank sensitive space.
Furthermore, in the row space, we encourage feature dimensions between target
and sensitive representations to be orthogonal. The effectiveness of the
proposed method is demonstrated with extensive experiments on the CheXpert
dataset. To our best knowledge, this is the first work to mitigate unfairness
with respect to multiple sensitive attributes in the field of medical imaging.
( 2 min )
We present a new convolution layer for deep learning architectures which we
call QuadConv -- an approximation to continuous convolution via quadrature. Our
operator is developed explicitly for use on non-uniform, mesh-based data, and
accomplishes this by learning a continuous kernel that can be sampled at
arbitrary locations. Moreover, the construction of our operator admits an
efficient implementation which we detail and construct. In the setting of
compressing data arising from partial differential equation (PDE) simulations,
we show that QuadConv can match the performance of standard discrete
convolutions on uniform grid data by comparing a QuadConv autoencoder (QCAE) to
a standard convolutional autoencoder (CAE). Further, we show that the QCAE can
maintain this accuracy even on non-uniform data.
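The core idea, approximating a continuous convolution by a quadrature sum over non-uniform sample locations, can be sketched as follows (a fixed Gaussian kernel stands in for the learned continuous kernel, and the trapezoid-style weights are an illustrative choice):

```python
import numpy as np

def quad_conv(x_out, y_nodes, f_vals, kernel, quad_weights):
    # Approximate (k * f)(x) = integral of k(x - y) f(y) dy by a quadrature
    # sum over non-uniform nodes y_nodes with weights quad_weights.
    out = np.empty_like(x_out)
    for i, x in enumerate(x_out):
        out[i] = np.sum(quad_weights * kernel(x - y_nodes) * f_vals)
    return out

rng = np.random.default_rng(0)
y = np.sort(rng.uniform(0, 1, 200))   # non-uniform, mesh-like nodes
w = np.gradient(y)                    # trapezoid-style quadrature weights
f = np.sin(2 * np.pi * y)
gauss = lambda d: np.exp(-(d / 0.05) ** 2)
smoothed = quad_conv(np.linspace(0.1, 0.9, 5), y, f, gauss, w)
```

In QuadConv the kernel would be a learned function that can be sampled at arbitrary offsets, which is what frees the operator from uniform grids.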
( 2 min )
A current goal in the graph neural network literature is to enable
transformers to operate on graph-structured data, given their success on
language and vision tasks. Since the transformer's original sinusoidal
positional encodings (PEs) are not applicable to graphs, recent work has
focused on developing graph PEs, rooted in spectral graph theory or various
spatial features of a graph. In this work, we introduce a new graph PE, Graph
Automaton PE (GAPE), based on weighted graph-walking automata (a novel
extension of graph-walking automata). We compare the performance of GAPE with
other PE schemes on both machine translation and graph-structured tasks, and we
show that it generalizes several other PEs. An additional contribution of this
study is a theoretical and controlled experimental comparison of many recent
PEs in graph transformers, independent of the use of edge features.
( 2 min )
Molecular conformation generation (MCG) is a fundamental and important
problem in drug discovery. Many traditional methods have been developed to
solve the MCG problem, such as systematic searching, model-building, random
searching, distance geometry, molecular dynamics, Monte Carlo methods, etc.
However, they have some limitations depending on the molecular structures.
Recently, plenty of deep learning based MCG methods have appeared, claiming to
largely outperform the traditional methods. However, to our surprise, we design
a simple and cheap algorithm (parameter-free) based on the traditional methods
and find it is comparable to or even outperforms deep learning based MCG
methods in the widely used GEOM-QM9 and GEOM-Drugs benchmarks. In particular,
our designed algorithm is simply the clustering of the RDKit-generated
conformations. We hope our findings can help the community to revise the deep
learning methods for MCG. The code of the proposed algorithm could be found at
https://gist.github.com/ZhouGengmo/5b565f51adafcd911c0bc115b2ef027c.
( 2 min )
Contrastive learning is a powerful framework for learning self-supervised
representations that generalize well to downstream supervised tasks. We show
that multiple existing contrastive learning methods can be reinterpreted as
learning kernel functions that approximate a fixed positive-pair kernel. We
then prove that a simple representation obtained by combining this kernel with
PCA provably minimizes the worst-case approximation error of linear predictors,
under a straightforward assumption that positive pairs have similar labels. Our
analysis is based on a decomposition of the target function in terms of the
eigenfunctions of a positive-pair Markov chain, and a surprising equivalence
between these eigenfunctions and the output of Kernel PCA. We give
generalization bounds for downstream linear prediction using our Kernel PCA
representation, and show empirically on a set of synthetic tasks that applying
Kernel PCA to contrastive learning models can indeed approximately recover the
Markov chain eigenfunctions, although the accuracy depends on the kernel
parameterization as well as on the augmentation strength.
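The representation described above, combining a kernel with PCA, can be sketched with a plain eigendecomposition (the inner-product kernel below is an illustrative stand-in for the learned positive-pair kernel):

```python
import numpy as np

def kernel_pca_representation(K, dim):
    # Given a symmetric PSD kernel matrix K, the top eigenvectors scaled by
    # sqrt(eigenvalue) give the Kernel PCA representation used for
    # downstream linear prediction.
    vals, vecs = np.linalg.eigh(K)
    idx = np.argsort(vals)[::-1][:dim]   # top-`dim` eigenpairs
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))

rng = np.random.default_rng(0)
Z = rng.standard_normal((50, 8))   # stand-in learned embeddings
K = Z @ Z.T                        # e.g. an inner-product kernel on embeddings
R = kernel_pca_representation(K, dim=4)
```

The Gram matrix of R is the best rank-`dim` approximation to K, which is the sense in which the representation minimizes worst-case linear-prediction error in the paper's analysis.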
( 2 min )
Tongue twisters are meaningful sentences that are difficult to pronounce. The
process of automatically generating tongue twisters is challenging since the
generated utterance must satisfy two conditions at once: phonetic difficulty
and semantic meaning. Furthermore, phonetic difficulty is itself hard to
characterize and is expressed in natural tongue twisters through a
heterogeneous mix of phenomena such as alliteration and homophony. In this
paper, we propose PANCETTA: Phoneme Aware Neural Completion to Elicit Tongue
Twisters Automatically. We leverage phoneme representations to capture the
notion of phonetic difficulty, and we train language models to generate
original tongue twisters on two proposed task settings. To do this, we curate a
dataset called PANCETTA, consisting of existing English tongue twisters.
Through automatic and human evaluation, as well as qualitative analysis, we
show that PANCETTA generates novel, phonetically difficult, fluent, and
semantically meaningful tongue twisters.
( 2 min )
The Baum-Welch (B-W) algorithm is the most widely accepted method for
inferring hidden Markov models (HMM). However, it is prone to getting stuck in
local optima, and can be too slow for many real-time applications. Spectral
learning of HMMs (SHMMs), based on the method of moments (MOM) has been
proposed in the literature to overcome these obstacles. Despite its promises,
asymptotic theory for SHMM has been elusive, and the long-run performance of
SHMM can degrade due to unchecked propagation of error. In this paper, we (1)
provide an asymptotic distribution for the approximate error of the likelihood
estimated by SHMM, (2) propose a novel algorithm called projected SHMM
(PSHMM) that mitigates the problem of error propagation, and (3) develop online
learning variations of both SHMM and PSHMM that accommodate potential
nonstationarity. We compare the performance of SHMM with PSHMM and estimation
through the B-W algorithm on both simulated data and data from real world
applications, and find that PSHMM not only retains the computational advantages
of SHMM, but also provides more robust estimation and forecasting.
( 2 min )
Arunachalam and De Wolf (2018) showed that the sample complexity of quantum
batch learning of boolean functions, in the realizable and agnostic settings,
has the same form and order as the corresponding classical sample complexities.
In this paper, we extend this, ostensibly surprising, message to batch
multiclass learning, online boolean learning, and online multiclass learning.
For our online learning results, we first consider an adaptive adversary
variant of the classical model of Dawid and Tewari (2022). Then, we introduce
the first (to the best of our knowledge) model of online learning with quantum
examples.
( 2 min )
We propose to explore the potential of physics-informed neural networks
(PINNs) in solving a class of partial differential equations (PDEs) used to
model the propagation of chronic inflammatory bowel diseases, such as Crohn's
disease and ulcerative colitis. An unsupervised approach was favored during
the deep neural network training. Given the complexity of the underlying
biological system, characterized by intricate feedback loops and limited
availability of high-quality data, the aim of this study is to explore the
potential of PINNs in solving PDEs. In addition to providing this exploratory
assessment, we also aim to emphasize the principles of reproducibility and
transparency in our approach, with a specific focus on ensuring the robustness
and generalizability through the use of artificial intelligence. We will
quantify the relevance of the PINN method with several linear and non-linear
PDEs in relation to biology. However, it is important to note that the final
solution is dependent on the initial conditions, chosen boundary conditions,
and neural network architectures.
( 2 min )
Gamma-Phi losses constitute a family of multiclass classification loss
functions that generalize the logistic and other common losses, and have found
application in the boosting literature. We establish the first general
sufficient condition for the classification-calibration of such losses. In
addition, we show that a previously proposed sufficient condition is in fact
not sufficient.
( 2 min )
Dimensionality reduction (DR) plays a vital role in the visual analysis of
high-dimensional data. One main aim of DR is to reveal hidden patterns that lie
on intrinsic low-dimensional manifolds. However, DR often overlooks important
patterns when the manifolds are distorted or masked by certain influential data
attributes. This paper presents a feature learning framework, FEALM, designed
to generate a set of optimized data projections for nonlinear DR in order to
capture important patterns in the hidden manifolds. These projections produce
maximally different nearest-neighbor graphs so that resultant DR outcomes are
significantly different. To achieve such a capability, we design an
optimization algorithm as well as introduce a new graph dissimilarity measure,
named neighbor-shape dissimilarity. Additionally, we develop interactive
visualizations to assist comparison of obtained DR results and interpretation
of each DR result. We demonstrate FEALM's effectiveness through experiments and
case studies using synthetic and real-world datasets.
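A toy version of comparing nearest-neighbor graphs across data projections (the edge-disagreement measure below is a crude stand-in for the paper's neighbor-shape dissimilarity):

```python
import numpy as np

def knn_graph(X, k):
    # Boolean adjacency matrix of the k-nearest-neighbor graph (self excluded).
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    nn = np.argsort(d, axis=1)[:, :k]
    A = np.zeros(d.shape, dtype=bool)
    rows = np.repeat(np.arange(len(X)), k)
    A[rows, nn.ravel()] = True
    return A

def graph_dissimilarity(A, B):
    # Fraction of neighbor edges on which the two graphs disagree.
    return np.mean(A != B)

rng = np.random.default_rng(0)
X = rng.standard_normal((40, 5))
# Two different feature subsets induce different neighbor structures.
d1 = graph_dissimilarity(knn_graph(X[:, :2], 5), knn_graph(X[:, 2:], 5))
d0 = graph_dissimilarity(knn_graph(X[:, :2], 5), knn_graph(X[:, :2], 5))
```

FEALM's optimization would search for projections that make such dissimilarities as large as possible, so that the resulting DR views reveal distinct patterns.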
( 2 min )
Semi-supervised learning is a powerful technique for leveraging unlabeled
data to improve machine learning models, but it can be affected by the presence
of "informative" labels, which occur when some classes are more likely to be
labeled than others. In the missing data literature, such labels are called
missing not at random. In this paper, we propose a novel approach to address
this issue by estimating the missing-data mechanism and using inverse
propensity weighting to debias any SSL algorithm, including those using data
augmentation. We also propose a likelihood ratio test to assess whether or not
labels are indeed informative. Finally, we demonstrate the performance of the
proposed methods on different datasets, in particular on two medical datasets
for which we design pseudo-realistic missing data scenarios.
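The inverse propensity weighting step can be sketched as follows, assuming the labeling propensities are known (in the paper they are estimated from the missing-data mechanism):

```python
import numpy as np

def ipw_weights(labeled_mask, propensity):
    # Each labeled example i is reweighted by 1 / P(label observed | i),
    # which debiases statistics computed on the labeled subset.
    w = np.zeros(len(labeled_mask))
    w[labeled_mask] = 1.0 / propensity[labeled_mask]
    return w

# Toy "missing not at random" setting: class 1 is labeled far more often.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=1000)
p = np.where(y == 1, 0.8, 0.2)            # labeling propensities
labeled = rng.uniform(size=1000) < p
w = ipw_weights(labeled, p)

# Reweighted class frequency on the labeled set recovers the true balance.
balance = np.average(y[labeled], weights=w[labeled])
```

The same weights can be plugged into any SSL loss over the labeled examples, including losses that use data augmentation.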
( 2 min )
In this paper, we propose a model-free feature selection method for
ultra-high dimensional data with a massive number of features. This is a
two-phase procedure: we propose using the fused Kolmogorov filter together with
random forest based RFE to remove model limitations and reduce the
computational complexity. The method is fully nonparametric and can work with
various types of datasets. It
has several appealing characteristics, i.e., accuracy, model-free, and
computational efficiency, and can be widely used in practical problems, such as
multiclass classification, nonparametric regression, and Poisson regression,
among others. We show that the proposed method is selection consistent and
$L_2$ consistent under weak regularity conditions. We further demonstrate the
superior performance of the proposed method over other existing methods by
simulations and real data examples.
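Phase one of such a procedure, screening features by a Kolmogorov-Smirnov-type statistic, might look like the following sketch for a binary response (the fused variant and the random-forest RFE phase are omitted; thresholds and sizes are illustrative):

```python
import numpy as np

def ks_statistic(a, b):
    # Two-sample Kolmogorov-Smirnov statistic, computed directly from
    # the two empirical CDFs evaluated on the pooled sample.
    grid = np.sort(np.concatenate([a, b]))
    cdf_a = np.searchsorted(np.sort(a), grid, side="right") / len(a)
    cdf_b = np.searchsorted(np.sort(b), grid, side="right") / len(b)
    return np.max(np.abs(cdf_a - cdf_b))

def screen_features(X, y, keep):
    # Rank features by the KS distance between class-conditional
    # distributions; keep the top `keep` for the second (RFE) phase.
    scores = np.array([ks_statistic(X[y == 0, j], X[y == 1, j])
                       for j in range(X.shape[1])])
    return np.argsort(scores)[::-1][:keep]

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 300)
X = rng.standard_normal((300, 50))
X[:, 7] += 2.0 * y                    # only feature 7 is informative
selected = screen_features(X, y, keep=5)
```

Because the KS statistic is distribution-free, this screening step stays model-free, matching the spirit of the proposed method.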
( 2 min )
Hi everyone, I used several machine vision algorithms to determine the fastest lane on border crossings. I have worked on this for the past few months and would love to know what you think about it. You can check out the detailed steps and code on the medium article in this link.
submitted by /u/andrea_m2000
Modern model pre-training often calls for larger cluster deployment to reduce time and cost. At the server level, such training workloads demand faster compute and increased memory allocation. As models grow to hundreds of billions of parameters, they require a distributed training mechanism that spans multiple nodes (instances). In October 2022, we launched Amazon EC2 […]
( 10 min )
Enterprises across the globe are looking to utilize multiple data sources to implement a unified search experience for their employees and end customers. Considering the large volume of data that needs to be examined and indexed, the retrieval speed, solution scalability, and search performance become key factors to consider when choosing an enterprise intelligent search […]
( 7 min )
Novel AI technologies are generating images, stories and, now, new ways to imagine the automotive future. At NVIDIA GTC, a global conference for the era of AI and the metaverse running online March 20-23, industry luminaries working on these breakthroughs will come together and share their visions to transform transportation. This year’s slate of in-depth Read article >
( 5 min )
The video above represents one of the first times that a pangolin, one of the world’s most critically endangered species, was detected in real time using artificial intelligence. A U.K.-based nonprofit called Conservation AI made this possible with the help of NVIDIA technology. Such use of AI can help track even the rarest, most reclusive Read article >
( 7 min )
Fellow Hunters, get ready! This GFN Thursday welcomes Capcom’s Monster Hunter Rise and the expansion Sunbreak to the cloud, arriving soon for members. Settle down for the weekend with 10 new games supported in the GeForce NOW library, including The Settlers: New Allies. Plus, Amsterdam and Ashburn are next to light up on the RTX Read article >
( 5 min )
We’re clarifying how ChatGPT's behavior is shaped and our plans for improving that behavior, allowing more user customization, and getting more public input into our decision-making in these areas.
OpenAI’s mission is to ensure that artificial general intelligence (AGI)[1] benefits all of humanity.
( 6 min )
The goal of this paper is to make a strong point for the usage of dynamical
models when using reinforcement learning (RL) for feedback control of dynamical
systems governed by partial differential equations (PDEs). To bridge the gap
between the immense promise we see in RL and its applicability to complex
engineering systems, the main challenges are the massive requirements in terms
of the training data, as well as the lack of performance guarantees. We present
a solution for the first issue using a data-driven surrogate model in the form
of a convolutional LSTM with actuation. We demonstrate that learning an
actuated model in parallel to training the RL agent significantly reduces the
total amount of required data sampled from the real system. Furthermore, we
show that iteratively updating the model is of major importance to avoid biases
in the RL training. Detailed ablation studies reveal the most important
ingredients of the modeling process. We use the chaotic Kuramoto-Sivashinsky
equation to demonstrate our findings.
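The alternating scheme of learning an actuated surrogate model in parallel with RL training can be summarized as a loop skeleton (all components are stand-ins, not the paper's convolutional LSTM or its agent):

```python
import numpy as np

def train_with_surrogate(real_step, surrogate, agent, n_outer=10, n_inner=100):
    # Alternate between (1) collecting a small batch of expensive samples
    # from the real system, (2) refitting the surrogate model -- iterative
    # updates are what avoid biasing the RL training -- and (3) training
    # the agent cheaply inside the surrogate.
    data = []
    state = np.zeros(4)
    for outer in range(n_outer):
        for _ in range(5):                       # (1) real-system rollout
            action = agent.act(state)
            next_state = real_step(state, action)
            data.append((state, action, next_state))
            state = next_state
        surrogate.fit(data)                      # (2) refit surrogate
        for _ in range(n_inner):                 # (3) cheap RL updates
            agent.update(surrogate)
    return agent
```

The point of the structure is that the real system is queried only 5 times per outer iteration, while the agent performs many more updates against the learned model.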
( 2 min )
Due to the precautionary measures during the COVID-19 pandemic many
universities offered unproctored take-home exams. We propose methods to detect
potential collusion between students and apply our approach on event log data
from take-home exams during the pandemic. We find groups of students with
suspiciously similar exams. In addition, we compare our findings to a proctored
control group. By this, we establish a rule of thumb for evaluating which cases
are "outstandingly similar", i.e., suspicious cases.
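A toy version of flagging "outstandingly similar" exams via pairwise set similarity (the Jaccard measure and the threshold are illustrative choices, not the paper's method):

```python
def jaccard(a, b):
    # Similarity of two exams represented as sets of answer events.
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def flag_suspicious(exams, threshold):
    # Compare all pairs; pairs above the threshold are flagged for review.
    flagged = []
    ids = list(exams)
    for i in range(len(ids)):
        for j in range(i + 1, len(ids)):
            if jaccard(exams[ids[i]], exams[ids[j]]) >= threshold:
                flagged.append((ids[i], ids[j]))
    return flagged

exams = {
    "s1": ["q1:A", "q2:C", "q3:B", "q4:D"],
    "s2": ["q1:A", "q2:C", "q3:B", "q4:A"],   # near-identical to s1
    "s3": ["q1:B", "q2:D", "q3:A", "q4:C"],
}
pairs = flag_suspicious(exams, threshold=0.5)
```

In practice the threshold would be calibrated against a proctored control group, as the abstract describes, rather than chosen by hand.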
( 2 min )
Recent applications of pattern recognition techniques on brain connectome
classification using functional connectivity (FC) neglect the non-Euclidean
topology and causal dynamics of brain connectivity across time. In this paper,
a deep probabilistic spatiotemporal framework developed based on variational
Bayes (DSVB) is proposed to learn time-varying topological structures in
dynamic brain FC networks for autism spectrum disorder (ASD) identification.
The proposed framework incorporates a spatial-aware recurrent neural network to
capture rich spatiotemporal patterns across dynamic FC networks, followed by a
fully-connected neural network to exploit these learned patterns for
subject-level classification. To overcome model overfitting on limited training
datasets, an adversarial training strategy is introduced to learn graph
embedding models that generalize well to unseen brain networks. Evaluation on
the ABIDE resting-state functional magnetic resonance imaging dataset shows
that our proposed framework significantly outperformed state-of-the-art methods
in identifying ASD. Dynamic FC analyses with DSVB learned embeddings reveal
apparent group difference between ASD and healthy controls in network profiles
and switching dynamics of brain states.
( 2 min )
Most works on the fairness of machine learning systems focus on the blind
optimization of common fairness metrics, such as Demographic Parity and
Equalized Odds. In this paper, we conduct a comparative study of several bias
mitigation approaches to investigate their behaviors at a fine grain, the
prediction level. Our objective is to characterize the differences between fair
models obtained with different approaches. With comparable performances in
fairness and accuracy, are the different bias mitigation approaches impacting a
similar number of individuals? Do they mitigate bias in a similar way? Do they
affect the same individuals when debiasing a model? Our findings show that bias
mitigation approaches differ a lot in their strategies, both in the number of
impacted individuals and the populations targeted. More surprisingly, we show
these results even apply for several runs of the same mitigation approach.
These findings raise questions about the limitations of the current group
fairness metrics, as well as the arbitrariness, hence unfairness, of the whole
debiasing process.
( 2
min )
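The prediction-level comparison described above can be made concrete with a small sketch. The helper names and the binary-prediction setting are illustrative assumptions, not the paper's code: two debiased models can reach a similar Demographic Parity gap while flipping the predictions of different individuals.

```python
import numpy as np

def demographic_parity_gap(y_pred, group):
    """Absolute difference in positive-prediction rates between two groups."""
    g = np.asarray(group, dtype=bool)
    return abs(y_pred[g].mean() - y_pred[~g].mean())

def flipped_individuals(y_base, y_debiased):
    """Indices of individuals whose prediction changed after debiasing."""
    return np.flatnonzero(np.asarray(y_base) != np.asarray(y_debiased))
```

Comparing `flipped_individuals` across two mitigation methods with comparable `demographic_parity_gap` is exactly the kind of fine-grained check the abstract argues for.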
The existence of external (``side'') semantic knowledge has been shown to
result in more expressive computational event models. To enable the use of side
information that may be noisy or missing, we propose a semi-supervised
information bottleneck-based discrete latent variable model. We reparameterize
the model's discrete variables with auxiliary continuous latent variables and a
light-weight hierarchical structure. Our model is learned to minimize the
mutual information between the observed data and optional side knowledge that
is not already captured by the new, auxiliary variables. We theoretically show
that our approach generalizes past approaches, and perform an empirical case
study of our approach on event modeling. We corroborate our theoretical results
with strong empirical experiments, showing that the proposed method outperforms
previous proposed approaches on multiple datasets.
( 2
min )
This paper examines the separation of wireless communication and radar
signals, thereby guaranteeing cohabitation and acting as a panacea for spectrum
sensing. First, considering that the channel impulse response was known by the
receivers (communication and radar), we showed that optimizing the beamforming
weights mitigates the interference caused by the signals and improves the
physical layer security (PLS) of the system. Furthermore, when the channel
responses were unknown, we designed an interference filter as a low-complexity
noise and interference cancellation autoencoder. By mitigating the interference
on the
legitimate users, the PLS was guaranteed. Results showed that even for a low
signal-to-noise ratio, the autoencoder produces low root-mean-square error
(RMSE) values.
( 2
min )
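As a toy stand-in for the interference-cancellation autoencoder described above, here is a two-layer network trained by plain gradient descent to map noisy samples back to clean ones. The sizes, tanh activation, and training loop are illustrative assumptions, not the paper's design:

```python
import numpy as np

rng = np.random.default_rng(0)

def train_denoiser(noisy, clean, hidden=16, lr=0.05, epochs=500):
    """Train a small autoencoder-style denoiser: out = tanh(X @ W1) @ W2,
    minimizing the MSE between the output and the clean signal."""
    n, d = noisy.shape
    W1 = rng.normal(scale=0.1, size=(d, hidden))
    W2 = rng.normal(scale=0.1, size=(hidden, d))
    losses = []
    for _ in range(epochs):
        h = np.tanh(noisy @ W1)              # encoder
        out = h @ W2                         # decoder
        err = out - clean
        losses.append(float((err ** 2).mean()))
        d_out = 2 * err / err.size           # gradient of the MSE loss
        g_W2 = h.T @ d_out
        d_h = (d_out @ W2.T) * (1 - h ** 2)  # backprop through tanh
        g_W1 = noisy.T @ d_h
        W1 -= lr * g_W1
        W2 -= lr * g_W2
    return W1, W2, losses
```

On synthetic sinusoids plus Gaussian noise, the training loss decreases steadily; the paper's reported low-RMSE behavior at low SNR corresponds to the trained network, not this sketch.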
Intrigued by the claims of emergent reasoning capabilities in LLMs trained on
general web corpora, in this paper, we set out to investigate their planning
capabilities. We aim to evaluate (1) how good LLMs are by themselves in
generating and validating simple plans in commonsense planning tasks (of the
type that humans are generally quite good at) and (2) how good LLMs are in
being a source of heuristic guidance for other agents--either AI planners or
human planners--in their planning tasks. To investigate these questions in a
systematic rather than anecdotal manner, we start by developing a benchmark
suite based on the kinds of domains employed in the International Planning
Competition. On this benchmark, we evaluate LLMs in three modes: autonomous,
heuristic and human-in-the-loop. Our results show that LLMs' ability to
autonomously generate executable plans is quite meager, averaging only about a
3% success rate. The heuristic and human-in-the-loop modes show slightly more
promise. In addition to these results, we also make our benchmark and
evaluation tools available to support investigations by the research community.
( 2
min )
Artificial neural networks are being proposed as models of parts of the
brain. The networks are compared to recordings of biological neurons, and good
performance in reproducing neural responses is considered to support the
model's validity. A key question is how much this system identification
approach tells us about brain computation. Does it validate one model
architecture over another? We evaluate the ability of the most commonly used
comparison techniques, such as linear encoding models and centered kernel
alignment, to correctly identify a model, by replacing brain recordings with
known ground-truth models. System identification performance is quite variable;
it also depends significantly on factors independent of the ground-truth
architecture, such as the stimulus images.
functional similarity scores in identifying higher-level architectural motifs.
( 2
min )
Bilevel Optimization has witnessed notable progress recently with new
emerging efficient algorithms, yet it is underexplored in the Federated
Learning setting. It is unclear how the challenges of Federated Learning affect
the convergence of bilevel algorithms. In this work, we study Federated Bilevel
Optimization problems. We first propose the FedBiO algorithm that solves the
hyper-gradient estimation problem efficiently, then we propose FedBiOAcc to
accelerate FedBiO. FedBiO has communication complexity $O(\epsilon^{-1.5})$
with linear speed up, while FedBiOAcc achieves communication complexity
$O(\epsilon^{-1})$, sample complexity $O(\epsilon^{-1.5})$ and also the linear
speed up. We also study Federated Bilevel Optimization problems with local
lower-level problems, and prove that FedBiO and FedBiOAcc converge at the same
rates with some modifications.
( 2
min )
Sequential monitoring of high-dimensional nonlinear time series is studied
for a projection of the second-moment matrix, a problem interesting in its own
right and specifically arising in finance and deep learning. Open-end as well
as closed-end monitoring is studied under mild assumptions on the training
sample and the observations of the monitoring period. Asymptotics is based on
Gaussian approximations of projected partial sums allowing for an estimated
projection vector. Estimation is studied both for classical
non-$\ell_0$-sparsity as well as under sparsity. For the case that the optimal
projection depends on the unknown covariance matrix, hard- and soft-thresholded
estimators are studied. Applications in finance and training of deep neural
networks are discussed. The proposed detectors typically allow to reduce
dramatically the required computational costs as illustrated by monitoring
synthetic data.
( 2
min )
We introduce a boosting algorithm to pre-process data for fairness. Starting
from an initial fair but inaccurate distribution, our approach shifts towards
better data fitting while still ensuring a minimal fairness guarantee. To do
so, it learns the sufficient statistics of an exponential family with
boosting-compliant convergence. Importantly, we are able to theoretically prove
that the learned distribution will have a representation rate and statistical
rate data fairness guarantee. Unlike recent optimization based pre-processing
methods, our approach can be easily adapted for continuous domain features.
Furthermore, when the weak learners are specified to be decision trees, the
sufficient statistics of the learned distribution can be examined to provide
clues on sources of (un)fairness. Empirical results are presented to
demonstrate the quality of the results on real-world data.
( 2
min )
Energy efficient navigation constitutes an important challenge in electric
vehicles, due to their limited battery capacity. We employ a Bayesian approach
to model the energy consumption at road segments for efficient navigation. In
order to learn the model parameters, we develop an online learning framework
and investigate several exploration strategies such as Thompson Sampling and
Upper Confidence Bound. We then extend our online learning framework to the
multi-agent setting, where multiple vehicles adaptively navigate and learn the
parameters of the energy model. We analyze Thompson Sampling and establish
rigorous regret bounds on its performance in the single-agent and multi-agent
settings, through an analysis of the algorithm under batched feedback. Finally,
we demonstrate the performance of our methods via experiments on several
real-world city road networks.
( 2
min )
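As a rough illustration of the Bayesian approach with Thompson Sampling, here is a sketch for a Gaussian energy model over road segments. The conjugate-Gaussian segment model and the enumeration of candidate routes are simplifying assumptions, not the paper's actual formulation:

```python
import numpy as np

rng = np.random.default_rng(0)

class SegmentModel:
    """Normal posterior over a road segment's mean energy use, with known
    observation noise sigma and a conjugate Gaussian prior N(mu0, tau0^2)."""
    def __init__(self, mu0=1.0, tau0=1.0, sigma=0.5):
        self.mu, self.tau2, self.sigma2 = mu0, tau0 ** 2, sigma ** 2

    def sample(self):
        # Thompson Sampling: draw a plausible mean from the posterior.
        return rng.normal(self.mu, np.sqrt(self.tau2))

    def update(self, obs):
        # Standard Gaussian conjugate update after observing one trip.
        prec = 1 / self.tau2 + 1 / self.sigma2
        self.mu = (self.mu / self.tau2 + obs / self.sigma2) / prec
        self.tau2 = 1 / prec

def thompson_route(routes, models):
    """Pick the route minimizing sampled total energy.
    routes: lists of segment ids; models: {segment_id: SegmentModel}."""
    draws = {s: m.sample() for s, m in models.items()}
    return min(routes, key=lambda r: sum(draws[s] for s in r))
```

Each vehicle samples once per decision, drives the chosen route, then calls `update` on the traversed segments; the posterior sampling is what trades off exploration against exploitation.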
We prove that the Minimum Description Length learning rule exhibits tempered
overfitting. We obtain tempered agnostic finite sample learning guarantees and
characterize the asymptotic behavior in the presence of random label noise.
( 2
min )
We study the convergence rate of discretized Riemannian Hamiltonian Monte
Carlo on sampling from distributions in the form of $e^{-f(x)}$ on a convex
body $\mathcal{M}\subset\mathbb{R}^{n}$. We show that for distributions in the
form of $e^{-\alpha^{\top}x}$ on a polytope with $m$ constraints, the
convergence rate of a family of commonly-used integrators is independent of
$\left\Vert \alpha\right\Vert _{2}$ and the geometry of the polytope. In
particular, the implicit midpoint method (IMM) and the generalized Leapfrog
method (LM) have a mixing time of $\widetilde{O}\left(mn^{3}\right)$ to achieve
$\epsilon$ total variation distance to the target distribution. These
guarantees are based on a general bound on the convergence rate for densities
of the form $e^{-f(x)}$ in terms of parameters of the manifold and the
integrator. Our theoretical guarantee complements the empirical results of
[KLSV22], which shows that RHMC with IMM can sample ill-conditioned, non-smooth
and constrained distributions in very high dimension efficiently in practice.
( 2
min )
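For intuition, here is the standard Euclidean leapfrog integrator that the generalized Leapfrog method extends; this is an illustrative sketch only, not the paper's manifold-aware integrator, whose updates are implicit and depend on the metric:

```python
import numpy as np

def leapfrog(q, p, grad_f, step, n_steps):
    """Leapfrog integration of Hamiltonian dynamics for H(q, p) = f(q) + |p|^2/2.
    Euclidean special case; IMM and the generalized Leapfrog method replace
    these explicit updates with implicit, metric-aware ones."""
    q, p = q.copy(), p.copy()
    p -= 0.5 * step * grad_f(q)          # initial half step in momentum
    for _ in range(n_steps - 1):
        q += step * p                    # full step in position
        p -= step * grad_f(q)            # full step in momentum
    q += step * p
    p -= 0.5 * step * grad_f(q)          # final half step in momentum
    return q, p
```

The symplectic structure keeps the Hamiltonian nearly conserved over long trajectories, which is what makes the Metropolis acceptance rate of (R)HMC high.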
Monotonic linear interpolation (MLI) - the phenomenon that, along the line
connecting a random initialization with the minimizer it converges to, the loss
and accuracy vary monotonically - is commonly observed in the training of
neural networks. Such a phenomenon may seem to suggest that the optimization of
neural networks is easy. In this paper, we show that the MLI property is not
necessarily related to the hardness of optimization problems, and empirical
observations on MLI for deep neural networks depend heavily on biases. In
particular, we show that interpolating both weights and biases linearly leads
to very different influences on the final output, and when different classes
have different last-layer biases on a deep network, there will be a long
plateau in both the loss and accuracy interpolation (which existing theory of
MLI cannot explain). We also show how the last-layer biases for different
classes can be different even on a perfectly balanced dataset using a simple
model. Empirically we demonstrate that similar intuitions hold on practical
networks and realistic datasets.
( 2
min )
In this paper, we interpret disentanglement as the discovery of local charts
and trace how that definition naturally leads to an equivalent condition for
disentanglement: the disentangled factors must commute with each other. We
discuss the practical and theoretical implications of commutativity, in
particular the compression and disentanglement of generative models. Finally,
we conclude with a discussion of related approaches to disentanglement and how
they relate to our view of disentanglement from the manifold perspective.
( 2
min )
We introduce a method for embedding graphs as vectors in a
structure-preserving manner, showcasing its rich representational capacity and
giving some theoretical properties. Our procedure falls under the bind-and-sum
approach, and we show that our binding operation - the tensor product - is the
most general binding operation that respects the principle of superposition. We
also establish some precise results characterizing the behavior of our method,
and we show that our use of spherical codes achieves a packing upper bound.
Then, we perform experiments showcasing our method's accuracy in various graph
operations even when the number of edges is quite large. Finally, we establish
a link to adjacency matrices, showing that our method is, in some sense, a
generalization of adjacency matrices with applications towards large sparse
graphs.
( 2
min )
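A minimal sketch of the bind-and-sum idea with the tensor (outer) product, using random near-orthonormal node codes in place of the paper's spherical codes (an assumption made for brevity):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 256
# Near-orthonormal node codes: unit-ish norm, near-zero pairwise dot products.
codes = rng.normal(size=(5, d)) / np.sqrt(d)

def embed(edges):
    """Bind each directed edge (u, v) via the outer product and superpose."""
    G = np.zeros((d, d))
    for u, v in edges:
        G += np.outer(codes[u], codes[v])
    return G

def has_edge(G, u, v, thresh=0.5):
    """<G, codes[u] (x) codes[v]> is ~1 for stored edges, ~0 otherwise."""
    return codes[u] @ G @ codes[v] > thresh
```

Because the binding is an outer product, `G` is exactly a soft adjacency matrix in the code basis, which matches the abstract's remark that the method generalizes adjacency matrices.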
In this paper, we study bottleneck identification in networks via extracting
minimax paths. Many real-world networks have stochastic weights for which full
knowledge is not available in advance. Therefore, we model this task as a
combinatorial semi-bandit problem to which we apply a combinatorial version of
Thompson Sampling and establish an upper bound on the corresponding Bayesian
regret. Due to the computational intractability of the problem, we then devise
an alternative problem formulation which approximates the original objective.
Finally, we experimentally evaluate the performance of Thompson Sampling with
the approximate formulation on real-world directed and undirected networks.
( 2
min )
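For background on minimax paths: with known, deterministic weights, the bottleneck cost can be computed by a modified Dijkstra in which a path's cost is its maximum edge weight. This standard sketch is the deterministic core of the problem; the paper's contribution is handling stochastic weights in the semi-bandit setting:

```python
import heapq

def minimax_path_cost(graph, src, dst):
    """Minimize, over paths from src to dst, the maximum edge weight.
    graph: {u: [(v, w), ...]} adjacency lists with nonnegative weights."""
    best = {src: 0}
    heap = [(0, src)]
    while heap:
        cost, u = heapq.heappop(heap)
        if u == dst:
            return cost
        if cost > best.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in graph.get(u, []):
            c = max(cost, w)  # path cost = bottleneck, not sum
            if c < best.get(v, float("inf")):
                best[v] = c
                heapq.heappush(heap, (c, v))
    return float("inf")
```

The only change from shortest-path Dijkstra is `max(cost, w)` in place of `cost + w`; the greedy-extraction correctness argument carries over because `max` is monotone.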
Data pruning algorithms are commonly used to reduce the memory and
computational cost of the optimization process. Recent empirical results reveal
that random data pruning remains a strong baseline and outperforms most
existing data pruning methods in the high compression regime, i.e., where a
fraction of $30\%$ or less of the data is kept. This regime has recently
attracted a lot of interest as a result of the role of data pruning in
improving the so-called neural scaling laws; in [Sorscher et al.], the authors
showed the need for high-quality data pruning algorithms in order to beat the
sample power law.
In this work, we focus on score-based data pruning algorithms and show
theoretically and empirically why such algorithms fail in the high compression
regime. We demonstrate ``No Free Lunch'' theorems for data pruning and present
calibration protocols that enhance the performance of existing pruning
algorithms in this high compression regime using randomization.
( 2
min )
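The two pruning strategies compared above can be sketched in a few lines; the score semantics (e.g., a difficulty score such as EL2N or a forgetting count) are an assumption for illustration:

```python
import numpy as np

def prune(scores, keep_frac, method="score", rng=None):
    """Return indices of training examples to keep.
    'score': keep the highest-scoring (hardest) examples;
    'random': the strong baseline the text describes at high compression."""
    n = len(scores)
    k = max(1, int(keep_frac * n))
    if method == "score":
        return np.argsort(scores)[-k:]
    rng = rng or np.random.default_rng()
    return rng.choice(n, size=k, replace=False)
```

The high-compression regime corresponds to `keep_frac <= 0.3`, where the abstract argues score-based selection tends to lose to the random branch unless calibrated with randomization.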
Diffusion models achieve state-of-the-art performance in various generation
tasks. However, their theoretical foundations fall far behind. This paper
studies score approximation, estimation, and distribution recovery of diffusion
models, when data are supported on an unknown low-dimensional linear subspace.
Our result provides sample complexity bounds for distribution estimation using
diffusion models. We show that with a properly chosen neural network
architecture, the score function can be both accurately approximated and
efficiently estimated. Furthermore, the generated distribution based on the
estimated score function captures the data geometric structures and converges
to a close vicinity of the data distribution. The convergence rate depends on
the subspace dimension, indicating that diffusion models can circumvent the
curse of data ambient dimensionality.
( 2
min )
We propose new limiting dynamics for stochastic gradient descent in the small
learning rate regime called stochastic modified flows. These SDEs are driven by
a cylindrical Brownian motion and improve the so-called stochastic modified
equations by having regular diffusion coefficients and by matching the
multi-point statistics. As a second contribution, we introduce distribution
dependent stochastic modified flows which we prove to describe the fluctuating
limiting dynamics of stochastic gradient descent in the small learning rate -
infinite width scaling regime.
( 2
min )
We study the problem of discrete distribution estimation in KL divergence and
provide concentration bounds for the Laplace estimator. We show that the
deviation from the mean scales as $\sqrt{k}/n$ when $n \ge k$, improving upon the
best prior result of $k/n$. We also establish a matching lower bound that shows
that our bounds are tight up to polylogarithmic factors.
( 2
min )
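The Laplace estimator in question is the standard add-one smoothed frequency estimate, which keeps every probability strictly positive so that the KL divergence to it stays finite. A minimal sketch:

```python
import numpy as np

def laplace_estimate(counts):
    """Add-one (Laplace) estimator over k symbols: (n_i + 1) / (n + k)."""
    counts = np.asarray(counts, dtype=float)
    return (counts + 1) / (counts.sum() + len(counts))

def kl(p, q):
    """KL(p || q); q from the Laplace estimator is always strictly positive."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))
```

With `counts` drawn from the true distribution, `kl(p_true, laplace_estimate(counts))` is the random quantity whose concentration around its mean the abstract bounds.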
Machine-learned coarse-grained (CG) models have the potential for simulating
large molecular complexes beyond what is possible with atomistic molecular
dynamics. However, training accurate CG models remains a challenge. A widely
used methodology for learning CG force-fields maps forces from all-atom
molecular dynamics to the CG representation and matches them with a CG
force-field on average. We show that there is flexibility in how to map
all-atom forces to the CG representation, and that the most commonly used
mapping methods are statistically inefficient and potentially even incorrect in
the presence of constraints in the all-atom simulation. We define an
optimization statement for force mappings and demonstrate that substantially
improved CG force-fields can be learned from the same simulation data when
using optimized force maps. The method is demonstrated on the miniproteins
Chignolin and Tryptophan Cage and published as open-source code.
( 2
min )
Hyperbolic spaces have been quite popular in the recent past for representing
hierarchically organized data. Further, several classification algorithms for
data in these spaces have been proposed in the literature. These algorithms
mainly use either hyperplanes or geodesics for decision boundaries in a large
margin classifier setting, leading to a non-convex optimization problem. In
this paper, we propose a novel large margin classifier based on horocycle
(horosphere) decision boundaries that leads to a geodesically convex
optimization problem that can be optimized using any Riemannian gradient
descent technique guaranteeing a globally optimal solution. We present several
experiments depicting the performance of our classifier.
( 2
min )
Hello everyone. I am a software engineering assistant professor at a private university. I have got lots of older lecture videos on my channel.
I am using NVIDIA broadcast to remove noise and it works very well.
However, I want to improve audio quality as well.
After doing a lot of research I found that audio super-resolution is the way to go
The only GitHub repo I have found so far is not working
Any help is appreciated
How can I improve speech quality?
Here is my example lecture video (noise already removed - reuploaded - but the sound is not good)
C# Programming For Beginners - Lecture 2: Coding our First Application in .NET Core Console
https://youtu.be/XLsrsCCdSnU
submitted by /u/CeFurkan
[link] [comments]
( 41
min )
Amazon SageMaker JumpStart is the machine learning (ML) hub of SageMaker that offers over 350 built-in algorithms, pre-trained models, and pre-built solution templates to help you get started with ML fast. JumpStart provides one-click access to a wide variety of pre-trained models for common ML tasks such as object detection, text classification, summarization, text generation […]
( 11
min )
Here is a podcast episode with Noam Brown from Meta AI where we discuss his work on achieving human-level performance on poker and Diplomacy, as well as the power of spending compute at inference time!
submitted by /u/thejashGI
[link] [comments]
( 41
min )
AI-augmented applications, photorealistic rendering, simulation and other technologies are helping professionals achieve business-critical results from multi-app workflows faster than ever. Running these data-intensive, complex workflows, as well as sharing data and collaborating across geographically dispersed teams, requires workstations with high-end CPUs, GPUs and advanced networking. To help meet these demands, Intel and NVIDIA are powering […]
( 6
min )
Whether creating realistic digital humans that can express emotion or building immersive virtual worlds, 3D artists can reach new heights with NVIDIA Omniverse, a platform for creating and operating metaverse applications. A new Blender alpha release, now available in the Omniverse Launcher, lets users of the 3D graphics software optimize scenes and streamline workflows with […]
( 5
min )
Surfers, swimmers and beachgoers face a hidden danger in the ocean: rip currents. These narrow channels of water can flow away from the shore at speeds up to 2.5 meters per second, making them one of the biggest safety risks for those enjoying the ocean. To help keep beachgoers safe, Christo Rautenbach, a coastal and […]
( 4
min )
One of the primary goals in spectrum occupancy mapping is to create a system
that is robust to assumptions about the number of sensors, occupancy threshold
(in dBm), sensor noise, number of emitters and the propagation environment. We
show that such a system may be designed with neural networks using a process of
aggregation to allow a variable number of sensors during training and testing.
This process transforms the variable number of measurements into approximate
log-likelihood ratios (LLRs), which are fed as a fixed-resolution image into a
neural network. The use of LLRs provides robustness to the effects of noise
and occupancy threshold. In other words, a system may be trained for a nominal
number of sensors, threshold and noise levels, and still operate well at
various other levels without retraining. Our system operates without knowledge
of the number of emitters and does not explicitly attempt to estimate their
number or power. Receiver operating curves with realistic propagation
environments using topographic maps with commercial network design tools show
how the performance of the neural network varies with the environment. Even
with very low-resolution sensors, the system can still yield good performance.
( 2
min )
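The aggregation into approximate LLRs can be illustrated with a hypothetical Gaussian measurement model: each sensor's reading (in dBm) is scored against "vacant" and "occupied" hypotheses, and the per-sensor scores are summed into a fixed-size statistic regardless of how many sensors report. The means and noise level below are assumptions, not the paper's values:

```python
import numpy as np

def llr(x, mu0, mu1, sigma):
    """Log-likelihood ratio log p(x | occupied) / p(x | vacant) for a
    Gaussian measurement model with means mu0 (vacant) and mu1 (occupied)."""
    return ((x - mu0) ** 2 - (x - mu1) ** 2) / (2 * sigma ** 2)

def aggregate(measurements, mu0=-100.0, mu1=-80.0, sigma=5.0):
    """Sum per-sensor LLRs: a variable sensor count collapses into one score."""
    return sum(llr(x, mu0, mu1, sigma) for x in measurements)
```

In the paper, these per-location LLRs are rendered as a fixed-resolution image and fed to the neural network, which is what gives robustness to the sensor count, noise level, and occupancy threshold.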
Q-learning and SARSA with $\epsilon$-greedy exploration are leading
reinforcement learning methods. Their tabular forms converge to the optimal
Q-function under reasonable conditions. However, with function approximation,
these methods exhibit strange behaviors such as policy oscillation, chattering,
and convergence to different attractors (possibly even the worst policy) on
different runs, apart from the usual instability. A theory to explain these
phenomena has been a long-standing open problem, even for basic linear function
approximation (Sutton, 1999). Our work uses differential inclusion to provide
the first framework for resolving this problem. We also provide numerical
examples to illustrate our framework's prowess in explaining these algorithms'
behaviors.
( 2
min )
Approximating Stochastic Gradient Descent (SGD) as a Stochastic Differential
Equation (SDE) has allowed researchers to enjoy the benefits of studying a
continuous optimization trajectory while carefully preserving the stochasticity
of SGD. Analogous study of adaptive gradient methods, such as RMSprop and Adam,
has been challenging because there were no rigorously proven SDE approximations
for these methods. This paper derives the SDE approximations for RMSprop and
Adam, giving theoretical guarantees of their correctness as well as
experimental validation of their applicability to common large-scale vision
and language settings. A key practical result is the derivation of a
$\textit{square root scaling rule}$ to adjust the optimization hyperparameters
of RMSprop and Adam when changing batch size, and its empirical validation in
deep learning settings.
( 2
min )
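A sketch of the square-root scaling rule as we read it: when the batch size grows by a factor kappa, scale the learning rate by sqrt(kappa) and the 1-beta terms by kappa. Treat the beta adjustment as our interpretation rather than a quoted formula:

```python
def sqrt_scale(lr, beta1, beta2, batch, new_batch):
    """Square-root scaling rule for Adam/RMSprop under a batch-size change:
    lr -> sqrt(kappa) * lr, (1 - beta) -> kappa * (1 - beta), kappa = ratio.
    The beta adjustment here is our reading of the rule, not a quoted formula."""
    kappa = new_batch / batch
    return (lr * kappa ** 0.5,
            1 - kappa * (1 - beta1),
            1 - kappa * (1 - beta2))
```

For example, quadrupling the batch size doubles the learning rate and quadruples each `1 - beta`, keeping the SDE approximation (and hence the training dynamics) roughly invariant.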
We provide a first finite-particle convergence rate for Stein variational
gradient descent (SVGD). Specifically, whenever the target distribution is
sub-Gaussian with a Lipschitz score, SVGD with $n$ particles and an appropriate
step size sequence drives the kernel Stein discrepancy to zero at an order
$1/\sqrt{\log \log n}$ rate. We suspect that the dependence on $n$ can be improved,
and we hope that our explicit, non-asymptotic proof strategy will serve as a
template for future refinements.
( 2
min )
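For context, one SVGD update for 1-D particles with an RBF kernel looks like the sketch below (a standard textbook form of the update, not this paper's analysis; the bandwidth and step size are illustrative):

```python
import numpy as np

def svgd_step(x, grad_logp, step=0.1, h=1.0):
    """One SVGD update with RBF kernel k(a, b) = exp(-(a - b)^2 / (2h)).
    Each particle moves along
    phi(x_i) = (1/n) sum_j [ k(x_j, x_i) grad log p(x_j) + d/dx_j k(x_j, x_i) ]."""
    n = len(x)
    diff = x[:, None] - x[None, :]              # diff[i, j] = x_i - x_j
    K = np.exp(-diff ** 2 / (2 * h))            # kernel matrix
    g = grad_logp(x)                            # score at each particle
    repulsion = (diff * K / h).sum(axis=1)      # sum_j d/dx_j k(x_j, x_i)
    return x + step * (K @ g + repulsion) / n
```

The first term pulls particles toward high-density regions; the repulsion term keeps them spread out, and the kernel Stein discrepancy of the particle cloud is the quantity whose decay the abstract bounds.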
Despite the impressive successes of deep learning approaches for various
chemical problems such as property prediction, virtual screening, and de novo
molecule design, separately designed models for specific tasks are usually
required, and it is often difficult to synergistically combine these models for
novel tasks. To address this, here we present a bidirectional molecular
foundation model that can be used for both molecular structure and property
inferences through a single model, inspired by recent multimodal learning
methods such as VLP. Furthermore, thanks to the outstanding structure/property
alignment in a common embedding space, experimental results confirm that our
method leads to state-of-the-art performance and interpretable attention maps
in both multimodal and unimodal tasks, including conditional molecule
generation, property prediction, molecule classification, and reaction
prediction.
( 2
min )
We survey a current, heated debate in the AI research community on whether
large pre-trained language models can be said to "understand" language -- and
the physical and social situations language encodes -- in any important sense.
We describe arguments that have been made for and against such understanding,
and key questions for the broader sciences of intelligence that have arisen in
light of these arguments. We contend that a new science of intelligence can be
developed that will provide insight into distinct modes of understanding, their
strengths and limitations, and the challenge of integrating diverse forms of
cognition.
( 2
min )
A formal write-up of the simple proof (1995) of the existence of calibrated
forecasts by the minimax theorem, which moreover shows that $N^3$ periods
suffice to guarantee a calibration error of at most $1/N$.
( 2
min )
We present ASR Bundestag, a dataset for automatic speech recognition in
German, consisting of 610 hours of aligned audio-transcript pairs for
supervised training as well as 1,038 hours of unlabeled audio snippets for
self-supervised learning, based on raw audio data and transcriptions from
plenary sessions and committee meetings of the German parliament. In addition,
we discuss the approaches used for the automated creation of speech datasets
and assess the quality of the resulting dataset based on evaluations and
finetuning of a pre-trained state-of-the-art model. We make the dataset
publicly available, including all subsets.
( 2
min )
We propose a new \textit{quadratic programming-based} method of approximating
a nonstandard density using a multivariate Gaussian density. Such nonstandard
densities usually arise while developing posterior samplers for unobserved
components models involving inequality constraints on the parameters. For
instance, Chan et al. (2016) provided a new model of trend inflation with
linear inequality constraints on the stochastic trend. We implemented the
proposed quadratic programming-based method for this model and compared it to
the existing approximation. We observed that the proposed method works as well
as the existing approximation in terms of the final trend estimates while
achieving gains in terms of sample efficiency.
( 2
min )
We develop a new approach to drifting games, a class of two-person games with
many applications to boosting and online learning settings. Our approach
involves (a) guessing an asymptotically optimal potential by solving an
associated partial differential equation (PDE); then (b) justifying the guess,
by proving upper and lower bounds on the final-time loss whose difference
scales like a negative power of the number of time steps. The proofs of our
potential-based upper bounds are elementary, using little more than Taylor
expansion. The proofs of our potential-based lower bounds are also elementary,
combining Taylor expansion with probabilistic or combinatorial arguments. Not
only is our approach more elementary, but we give new potentials and derive
corresponding upper and lower bounds that match each other in the asymptotic
regime.
( 2
min )
The COVID-19 pandemic has significantly impacted the construction sector,
which is sensitive to economic cycles. In order to boost value and efficiency
in this sector, the use of innovative exploration technologies such as
ultrasonic testing and Artificial Intelligence techniques in building material research
is becoming increasingly crucial. In this study, we developed two models for
predicting the Los Angeles (LA) and Micro Deval (MDE) coefficients, two
important geotechnical tests used to determine the quality of rock aggregates.
These coefficients describe the resistance of aggregates to fragmentation and
abrasion. The ultrasound velocity, porosity, and density of the rocks were
determined and used as inputs to develop prediction models using multiple
regression and an artificial neural network. These models may be used to assess
the quality of rock aggregates at the exploration stage without the need for
tedious laboratory analysis.
( 2
min )
Despite all the benefits of automated hyperparameter optimization (HPO), most
modern HPO algorithms are black-boxes themselves. This makes it difficult to
understand the decision process which leads to the selected configuration,
reduces trust in HPO, and thus hinders its broad adoption. Here, we study the
combination of HPO with interpretable machine learning (IML) methods such as
partial dependence plots. These techniques are increasingly used to explain
the marginal effect of hyperparameters on the black-box cost function or to
quantify the importance of hyperparameters. However, if such methods are
naively applied to the experimental data of the HPO process in a post-hoc
manner, the underlying sampling bias of the optimizer can distort
interpretations. We propose a modified HPO method which efficiently balances
the search for the global optimum w.r.t. predictive performance \emph{and} the
reliable estimation of IML explanations of an underlying black-box function by
coupling Bayesian optimization and Bayesian Algorithm Execution. On benchmark
cases of both synthetic objectives and HPO of a neural network, we demonstrate
that our method returns more reliable explanations of the underlying black-box
without a loss of optimization performance.
( 2
min )
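The partial dependence of a hyperparameter is obtained by averaging the cost over the remaining hyperparameters. A minimal sketch, assuming an invented surrogate `cost` function that stands in for the black-box objective (not the paper's method):

```python
import numpy as np

# Toy surrogate cost over two hyperparameters (invented for illustration).
def cost(lr, depth):
    return (np.log10(lr) + 2.0) ** 2 + 0.1 * (depth - 5) ** 2

lrs = np.logspace(-4, 0, 9)
depths = np.arange(1, 11)

# Partial dependence of the learning rate: marginalize over the depth grid.
pd_lr = [np.mean([cost(lr, d) for d in depths]) for lr in lrs]
best_lr = lrs[int(np.argmin(pd_lr))]
print(best_lr)  # 0.01: the marginal optimum of the toy surrogate
```

The paper's point is that evaluations collected by an optimizer are not an unbiased grid like the one above, which is why naive post-hoc partial dependence can mislead.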
This manuscript investigates the one-pass stochastic gradient descent (SGD)
dynamics of a two-layer neural network trained on Gaussian data and labels
generated by a similar, though not necessarily identical, target function. We
rigorously analyse the limiting dynamics via a deterministic and
low-dimensional description in terms of the sufficient statistics for the
population risk. Our unifying analysis bridges different regimes of interest,
such as the classical gradient-flow regime of vanishing learning rate, the
high-dimensional regime of large input dimension, and the overparameterised
"mean-field" regime of large network width, covering as well the intermediate
regimes where the limiting dynamics is determined by the interplay between
these behaviours. In particular, in the high-dimensional limit, the
infinite-width dynamics is found to remain close to a low-dimensional subspace
spanned by the target principal directions. Our results therefore provide a
unifying picture of the limiting SGD dynamics with synthetic data.
( 2
min )
This paper empirically studies commonly observed training difficulties of
Physics-Informed Neural Networks (PINNs) on dynamical systems. Our results
indicate that fixed points which are inherent to these systems play a key role
in the optimization of the physics loss function embedded in PINNs. We observe
that the loss landscape exhibits local optima that are shaped by the presence
of fixed points. We find that these local optima contribute to the complexity
of the physics loss optimization, which can explain common training
difficulties and resulting nonphysical predictions. Under certain settings,
e.g., initial conditions close to fixed points or long simulation times, we
show that those optima can even attain a lower loss than that of the desired
solution.
( 2
min )
We describe a parametrized space for simple meta-reinforcement-learning
(meta-RL) tasks with arbitrary stimuli. The parametrization allows us to
randomly generate an arbitrary number of novel simple meta-learning tasks. The
space of meta-RL tasks covered by this parametrization includes many well-known
meta-RL tasks, such as bandit tasks, the Harlow task, T-mazes, the Daw two-step
task and others. Simple extensions allow it to capture tasks based on
two-dimensional topological spaces, such as find-the-spot or key-door tasks. We
describe a number of randomly generated meta-RL tasks and discuss potential
issues arising from random generation.
( 2
min )
Advances in neural modeling have achieved state-of-the-art (SOTA) results on
public natural language processing (NLP) benchmarks, at times surpassing human
performance. However, there is a gap between public benchmarks and real-world
applications where noise, such as typographical or grammatical mistakes, is
abundant and can result in degraded performance. Unfortunately, works that
evaluate the robustness of neural models on noisy data and propose
improvements are limited to the English language. Upon analyzing noise in
different languages, we observe that noise types vary greatly across languages.
Thus, existing investigations do not generalize trivially to multilingual
settings. To benchmark the performance of pretrained multilingual language
models, we construct noisy datasets covering five languages and four NLP tasks
and observe a clear gap in the performance between clean and noisy data in the
zero-shot cross-lingual setting. After investigating several ways to boost the
robustness of multilingual models in this setting, we propose Robust
Contrastive Pretraining (RCP). RCP combines data augmentation with a
contrastive loss term at the pretraining stage and achieves large improvements
on noisy (and original) test data across two sentence-level (+3.2%) and two
sequence-labeling (+10 F1-score) multilingual classification tasks.
( 2
min )
As advertisers increasingly shift their budgets toward digital advertising,
forecasting advertising costs is essential for making budget plans to optimize
marketing campaign returns. In this paper, we perform a comprehensive study
using a variety of time-series forecasting methods to predict daily average
cost-per-click (CPC) in the online advertising market. We show that forecasting
advertising costs would benefit from multivariate models using covariates from
competitors' CPC development identified through time-series clustering. We
further interpret the results by analyzing feature importance and temporal
attention. Finally, we show that our approach has several advantages over
models that individual advertisers might build based solely on their collected
data.
( 2
min )
We motivate and introduce CHARD: Clinical Health-Aware Reasoning across
Dimensions, to investigate the capability of text generation models to act as
implicit clinical knowledge bases and generate free-flow textual explanations
about various health-related conditions across several dimensions. We collect
and present an associated dataset, CHARDat, consisting of explanations about 52
health conditions across three clinical dimensions. We conduct extensive
experiments using BART and T5 along with data augmentation, and perform
automatic, human, and qualitative analyses. We show that while our models can
perform decently, CHARD is very challenging with strong potential for further
exploration.
( 2
min )
In addition to the weights of shared synaptic connections, PNN includes
weights for the effective ranges of synapses [14-24]. PNN balances synaptic
strength between the dynamics of synapse phagocytosis and the static
constraint of a constant total synapse length [14], and incorporates
leader-following behavior as in a school of fish. Synapse formation inhibits
dendrite generation to a certain extent, both in experiments and in PNN
simulations [15]. The memory persistence gradient of the retrograde circuit
resembles resilience enforcement in a Spring Boot application. Relatively good
and inferior gradient information is stored in memory engram cells during
synapse formation in the retrograde circuit, like the folds of the brain [16].
Regarding the controversy over whether human hippocampal neurogenesis persists
throughout aging, PNN suggests it may form a new and longer circuit in late
iterations [17,18]. Closing the critical period causes neurological disorders,
both in experiments and in PNN simulations [19]. Considering the persistence
of both negative and positive memories helps synapse lengths change across
iterations better than considering positive memory alone [20]. Astrocytic
phagocytosis avoids the local accumulation of synapses in simulation; a lack
of astrocytic phagocytosis causes excitatory and functionally impaired
synapses to accumulate in experiments, destroying cognition, and produces
locally longer synapses and worse results in PNN simulations [21]. This gives
a relationship between intelligence and cortical thickness, and individual
differences in the brain [22]. PNN also considers the memory engram cells that
strengthen synaptic strength [23]. The effects of PNN's memory structure and
tPBM may be the same owing to the powerful penetrability of signals [24].
Memory persistence also inhibits local synaptic accumulation. Through PNN, the
relatively good and inferior solutions may be introduced into PSO. The simple
PNN has only synaptic phagocytosis.
( 3
min )
Reinforcement learning is an effective way to solve the decision-making
problems. It is a meaningful and valuable direction to investigate autonomous
air combat maneuver decision-making method based on reinforcement learning.
However, when using reinforcement learning to solve the decision-making
problems with sparse rewards, such as air combat maneuver decision-making, it
costs too much time for training and the performance of the trained agent may
not be satisfactory. To address these problems, a method based on curriculum
learning is proposed. First, three curricula for air combat maneuver
decision-making are designed: an angle curriculum, a distance curriculum, and
a hybrid curriculum. These curricula are used to train air combat agents
separately and are
compared with the original method without any curriculum. The training results
show that angle curriculum can increase the speed and stability of training,
and improve the performance of the agent; distance curriculum can increase the
speed and stability of agent training; hybrid curriculum has a negative impact
on training, because it makes the agent get stuck at local optimum. The
simulation results show that after training, the agent can handle the
situations where targets approach from different directions, and the maneuver
decision results are consistent with the characteristics of the missile.
( 2
min )
Traffic signal control is safety-critical for our daily life. Roughly
one-quarter of road accidents in the U.S. happen at intersections due to
problematic signal timing, urging the development of safety-oriented
intersection control. However, existing studies on adaptive traffic signal
control using reinforcement learning technologies have focused mainly on
minimizing traffic delay but neglecting the potential exposure to unsafe
conditions. We, for the first time, incorporate road safety standards as
enforcement to ensure the safety of existing reinforcement learning methods,
aiming toward operating intersections with zero collisions. We propose a
safety-enhanced residual reinforcement learning method (SafeLight) and employ
multiple optimization techniques, such as a multi-objective loss function and
reward shaping, for better knowledge integration. Extensive experiments are
conducted using both synthetic and real-world benchmark datasets. Results show
that our method can significantly reduce collisions while increasing traffic
mobility.
( 2
min )
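A multi-objective loss of this kind can be sketched as a weighted sum of a mobility term and a safety penalty. The weights and values below are illustrative, not the paper's formulation:

```python
# Weighted-sum sketch of a multi-objective traffic-signal loss (illustrative).
def multi_objective_loss(avg_delay, collision_risk, w_safety=10.0):
    # Mobility term (seconds of delay) plus a heavily weighted safety penalty.
    return avg_delay + w_safety * collision_risk

safe = multi_objective_loss(12.5, 0.0)
risky = multi_objective_loss(10.0, 1.0)  # slightly faster but unsafe
print(safe, risky)  # 12.5 20.0: the safety weight dominates the comparison
```

Choosing the safety weight large enough makes any collision risk outweigh a modest delay improvement, which is the intent behind safety-oriented signal control.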
This work studies discrete diffusion probabilistic models with applications
to natural language generation. We derive an alternative yet equivalent
formulation of the sampling from discrete diffusion processes and leverage this
insight to develop a family of reparameterized discrete diffusion models. The
derived generic framework is highly flexible, offers a fresh perspective of the
generation process in discrete diffusion models, and features more effective
training and decoding techniques. We conduct extensive experiments to evaluate
the text generation capability of our model, demonstrating significant
improvements over existing diffusion models.
( 2
min )
We consider the problem of learning multioutput function classes in batch and
online settings. In both settings, we show that a multioutput function class is
learnable if and only if each single-output restriction of the function class
is learnable. This provides a complete characterization of the learnability of
multilabel classification and multioutput regression in both batch and online
settings. As an extension, we also consider multilabel learnability in the
bandit feedback setting and show a similar characterization as in the
full-feedback setting.
( 2
min )
In this paper, we extend the Wiener-Ito chaos decomposition to the class of
diffusion processes, whose drift and diffusion coefficient are of linear
growth. By omitting the orthogonality in the chaos expansion, we are able to
show that every $p$-integrable functional, for $p \in [1,\infty)$, can be
represented as sum of iterated integrals of the underlying process. Using a
truncated sum of this expansion and (possibly random) neural networks for the
integrands, whose parameters are learned in a machine learning setting, we show
that every financial derivative can be approximated arbitrarily well in the
$L^p$-sense. Since the hedging strategy of the approximating option can be
computed in closed form, we obtain an efficient algorithm that can replicate
any integrable financial derivative with short runtime.
( 2
min )
The tremendous growth in smart devices has given rise to several security
threats. One of the most prominent is malicious software, also known as
malware. Malware can corrupt a device and collapse an entire network, so its
early detection and mitigation are extremely important to avoid catastrophic
effects. In this work, we propose a solution for
malware detection using state-of-the-art natural language processing (NLP)
techniques. Our main focus is to provide a lightweight yet effective classifier
for malware detection which can be used for heterogeneous devices, be it a
resource constraint device or a resourceful machine. Our proposed model is
tested on the benchmark data set with an accuracy and log loss score of 99.13
percent and 0.04 respectively.
( 2
min )
Motivated by neural network training in low-bit floating and fixed-point
environments, this work studies the convergence of variants of SGD with
computational error. Considering a general stochastic Lipschitz continuous loss
function, a novel convergence result to a Clarke stationary point is presented
assuming that only an approximation of its stochastic gradient can be computed
as well as error in computing the SGD step itself. Different variants of SGD
are then tested empirically in a variety of low-precision arithmetic
environments, where improved test set accuracy is observed compared to SGD for
two image recognition tasks.
( 2
min )
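The effect of computational error can be illustrated by rounding each SGD iterate to a coarse grid, as a toy stand-in for low-bit arithmetic. This is a sketch of the general idea, not the paper's exact error model:

```python
# Minimize the toy objective f(x) = (x - 3)^2 with SGD whose iterates are
# stored at finite precision (rounded to a grid of spacing eps).
def round_to_grid(v, eps=0.01):
    return round(v / eps) * eps  # mimic finite-precision storage

x = 0.0
for _ in range(500):
    grad = 2.0 * (x - 3.0)          # exact gradient of the toy objective
    x = round_to_grid(x - 0.05 * grad)
print(x)  # settles in a small neighborhood of the minimizer 3
```

As in the paper's setting, the iterates converge only to a neighborhood of the stationary point whose size is governed by the computational error.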
Gradient descent methods have long been the de facto standard for training
deep neural networks. Millions of training samples are fed into models with
billions of parameters, which are slowly updated over hundreds of epochs.
Recently, it has been shown that large, randomly initialized neural networks
contain subnetworks that perform as well as fully trained models. This insight
offers a promising avenue for training future neural networks by simply pruning
weights from large, random models. However, this problem is combinatorially
hard and classical algorithms are not efficient at finding the best subnetwork.
In this paper, we explore how quantum algorithms could be formulated and
applied to this neuron selection problem. We introduce several methods for
local quantum neuron selection that reduce the entanglement complexity that
large scale neuron selection would require, making this problem more tractable
for current quantum hardware.
( 2
min )
Text-based game environments are challenging because agents must deal with
long sequences of text, execute compositional actions using text and learn from
sparse rewards. We address these challenges by proposing Long-Context Language
Decision Transformers (LLDTs), a framework that is based on long transformer
language models and decision transformers (DTs). LLDTs extend DTs with 3
components: (1) exponential tilt to guide the agent towards high obtainable
goals, (2) novel goal conditioning methods yielding significantly better
results than the traditional return-to-go (sum of all future rewards), and (3)
a model of future observations. Our ablation results show that predicting
future observations improves agent performance. To the best of our knowledge,
LLDTs are the first to address offline RL with DTs on these challenging games.
Our experiments show that LLDTs achieve the highest scores among many different
types of agents on some of the most challenging Jericho games, such as
Enchanter.
( 2
min )
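The traditional return-to-go that decision transformers condition on is the sum of all future rewards from each timestep, as the abstract notes. A minimal sketch, independent of the LLDT architecture:

```python
# Return-to-go: for each timestep t, the sum of rewards from t to the end.
def returns_to_go(rewards):
    rtg, total = [], 0.0
    for r in reversed(rewards):   # accumulate from the final step backwards
        total += r
        rtg.append(total)
    return rtg[::-1]              # restore chronological order

print(returns_to_go([1.0, 0.0, 2.0, 5.0]))  # [8.0, 7.0, 7.0, 5.0]
```

LLDTs replace this quantity with alternative goal-conditioning signals, which the paper reports works significantly better on Jericho games.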
Graph Neural Networks (GNNs) have achieved much success on graph-structured
data. In light of this, there has been increasing interest in studying their
expressive power. One line of work studies the capability of GNNs to
approximate permutation-invariant functions on graphs, and another focuses on
their power as tests for graph isomorphism. Our work connects these two
perspectives and proves their equivalence. We further develop a framework of
the expressive power of GNNs that incorporates both of these viewpoints using
the language of sigma-algebra, through which we compare the expressive power of
different types of GNNs together with other graph isomorphism tests. In
particular, we prove that the second-order Invariant Graph Network fails to
distinguish non-isomorphic regular graphs with the same degree. Then, we extend
it to a new architecture, Ring-GNN, which succeeds in distinguishing these
graphs and achieves good performances on real-world datasets.
( 2
min )
Recently, \cite{montasser2019vc} showed that finite VC dimension is not
sufficient for \textit{proper} adversarially robust PAC learning. In light of
this hardness result, there is a growing effort to study what type of
relaxations to the adversarially robust PAC learning setup can enable proper
learnability. In this work, we initiate the study of proper learning under
relaxations of the worst-case robust loss. We give a family of robust loss
relaxations under which VC classes are properly PAC learnable with sample
complexity close to what one would require in the standard PAC learning setup.
On the other hand, we show that for an existing and natural relaxation of the
worst-case robust loss, finite VC dimension is not sufficient for proper
learning. Lastly, we give new generalization guarantees for the adversarially
robust empirical risk minimizer.
( 2
min )
We prove a convergence theorem for U-statistics of degree two, where the data
dimension $d$ is allowed to scale with sample size $n$. We find that the
limiting distribution of a U-statistic undergoes a phase transition from the
non-degenerate Gaussian limit to the degenerate limit, regardless of its
degeneracy and depending only on a moment ratio. A surprising consequence is
that a non-degenerate U-statistic in high dimensions can have a non-Gaussian
limit with a larger variance and asymmetric distribution. Our bounds are valid
for any finite $n$ and $d$, independent of individual eigenvalues of the
underlying function, and dimension-independent under a mild assumption. As an
application, we apply our theory to two popular kernel-based distribution
tests, MMD and KSD, whose high-dimensional performance has been challenging to
study. In a simple empirical setting, our results correctly predict how the
test power at a fixed threshold scales with $d$ and the bandwidth.
( 2
min )
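A degree-two U-statistic averages a symmetric kernel over all unordered pairs of samples. A minimal sketch with an illustrative product kernel (not a kernel from the paper):

```python
import itertools

def u_statistic(xs, h):
    # Average the symmetric kernel h over all unordered sample pairs.
    pairs = list(itertools.combinations(xs, 2))
    return sum(h(x, y) for x, y in pairs) / len(pairs)

# Kernel h(x, y) = x * y, an illustrative choice.
val = u_statistic([1.0, 2.0, 3.0], lambda x, y: x * y)
print(val)  # (1*2 + 1*3 + 2*3) / 3
```

Kernel-based test statistics such as MMD and KSD, which the paper analyzes, have exactly this pairwise-average form with more elaborate kernels.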
Deep neural networks (DNN) have shown great capacity of modeling a dynamical
system; nevertheless, they usually do not obey physics constraints such as
conservation laws. This paper proposes a new learning framework named ConCerNet
to improve the trustworthiness of DNN-based dynamics modeling by endowing it
with invariant properties. ConCerNet consists of two steps: (i) a contrastive
learning method to automatically capture the system invariants (i.e.
conservation properties) along the trajectory observations; (ii) a neural
projection layer to guarantee that the learned dynamics models preserve the
learned invariants. We theoretically prove the functional relationship between
the learned latent representation and the unknown system invariant function.
Experiments show that our method consistently outperforms the baseline neural
networks in both coordinate error and conservation metrics by a large margin.
With neural network based parameterization and no dependence on prior
knowledge, our method can be extended to complex and large-scale dynamics by
leveraging an autoencoder.
( 2
min )
In this work, we consider the stochastic optimal control problem in
continuous time and a policy gradient method to solve it. In particular, we
study the gradient flow for the control, viewed as a continuous time limit of
the policy gradient. We prove the global convergence of the gradient flow and
establish a convergence rate under some regularity assumptions. The main
novelty in the analysis is the notion of local optimal control function, which
is introduced to compare the local optimality of the iterate.
( 2
min )
Human-robot interaction (HRI) research is progressively addressing
multi-party scenarios, where a robot interacts with more than one human user at
the same time. Conversely, research is still at an early stage for human-robot
collaboration (HRC). The use of machine learning techniques to handle such
types of collaboration requires data that are less feasible to produce than in
a typical HRC setup. This work outlines concepts for the design of concurrent
tasks
for non-dyadic HRC applications. Based upon these concepts, this study also
proposes an alternative way of gathering data regarding multiuser activity, by
collecting data related to single subjects and merging them in post-processing,
to reduce the effort involved in producing recordings of pair settings. To
validate this statement, 3D skeleton poses of activity of single subjects were
collected and merged in pairs. After this, the datapoints were used to
separately train a long short-term memory (LSTM) network and a variational
autoencoder (VAE) composed of spatio-temporal graph convolutional networks
(STGCN) to recognise the joint activities of the pairs of people. The results
showed that it is possible to make use of data collected in this way for pair
HRC settings and achieve performance similar to using data from groups of
users recorded under the same settings, avoiding the technical difficulties
involved in producing such data.
( 2
min )
In a recent paper, Ling et al. investigated the over-parametrized Deep
Equilibrium Model (DEQ) with ReLU activation and proved that the gradient
descent converges to a globally optimal solution at a linear convergence rate
for the quadratic loss function. In this paper, we show that this fact still
holds for DEQs with any general activation which has bounded first and second
derivatives. Since the new activation function is generally non-linear, a
general population Gram matrix is designed, and a new form of dual activation
with Hermite polynomial expansion is developed.
( 2
min )
Non-intrusive load monitoring (NILM) aims to decompose an aggregated
electrical usage signal into appliance-specific power consumption; it is a
classical example of a blind source separation task. Leveraging recent progress
on deep learning techniques, we design a new neural NILM model Multi-State Dual
CNN (MSDC). Different from previous models, MSDC explicitly extracts
information about the appliance's multiple states and state transitions, which
in turn regulates the prediction of signals for appliances. More specifically,
we employ a dual-CNN architecture: one CNN for outputting state distributions
and the other for predicting the power of each state. A new technique is
invented that utilizes conditional random fields (CRF) to capture state
transitions. Experiments on two real-world datasets REDD and UK-DALE
demonstrate that our model significantly outperforms state-of-the-art models
while having good generalization capacity, achieving 6%-10% MAE gains and
33%-51% SAE gains on unseen appliances.
( 2
min )
With the increased usage of artificial intelligence (AI), it is imperative to
understand how these models work internally. These needs have led to the
development of a new field called eXplainable artificial intelligence (XAI).
This field consists of a set of techniques that allow us, in principle, to
determine the causes of AI decisions. One unsolved question in XAI is how
to measure the quality of explanations. In this study, we propose a new method
to generate datasets with ground truth (GT). These datasets allow us to
measure how faithful a method is without ad hoc solutions. We conducted a set
of
experiments that compared our GT with real model explanations and obtained
excellent results confirming that our proposed method is correct.
( 2
min )
Engineering more secure software has become a critical challenge in the cyber
world. It is very important to develop methodologies, techniques, and tools for
developing secure software. To develop secure software, software developers
need to think like an attacker through mining software repositories. These aim
to analyze and understand the data repositories related to software
development. The main goal is to use these software repositories to support the
decision-making process of software development. There are different
vulnerability databases like Common Weakness Enumeration (CWE), Common
Vulnerabilities and Exposures database (CVE), and CAPEC. We utilized the
knowledge base maintained by MITRE. MITRE ATT&CK tactics and techniques have
been used in various ways, but tools for utilizing these tactics and
techniques in the
early stages of the software development life cycle (SDLC) are lacking. In this
paper, we use machine learning algorithms to map requirements to the MITRE
ATT&CK database and determine the accuracy of each mapping depending on the
data split.
( 2
min )
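A hypothetical sketch of mapping a requirement to the closest ATT&CK technique by bag-of-words cosine similarity. The technique snippets, the requirement text, and the similarity measure are all illustrative stand-ins; the paper's actual ML algorithms and data are not reproduced here:

```python
import math
from collections import Counter

def cosine(a, b):
    # Bag-of-words cosine similarity between two short texts.
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb)

techniques = {  # toy excerpts of technique descriptions
    "T1110 Brute Force": "adversaries may guess passwords via brute force",
    "T1566 Phishing": "adversaries send phishing messages to gain access",
}
req = "the system shall lock accounts after repeated password guess attempts"
best = max(techniques, key=lambda t: cosine(req, techniques[t]))
print(best)  # "T1110 Brute Force"
```

A trained classifier over such requirement texts, as the paper proposes, replaces this similarity lookup with a learned mapping whose accuracy can be measured per data split.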
Studies have shown that large pretrained language models exhibit biases
against social groups based on race, gender, etc., which they inherit from the
datasets they are trained on. Various researchers have proposed mathematical
tools for quantifying and identifying these biases, and methods have been
proposed to mitigate them. In this paper, we present a comprehensive
quantitative evaluation of different kinds of bias (race, gender, ethnicity,
age, etc.) exhibited by popular pretrained language models such as BERT and
GPT-2, together with a toolkit that provides plug-and-play interfaces
connecting bias-identification tools to large pretrained language models and
lets users test custom models against the same metrics. The toolkit also
allows users to debias existing and custom models using the debiasing
techniques proposed so far. The toolkit is available at
https://github.com/HrishikeshVish/Fairpy.
( 2
min )
Recent advances in instruction-following large language models (LLMs) have
led to dramatic improvements in a range of NLP tasks. Unfortunately, we find
that the same improved capabilities amplify the dual-use risks for malicious
purposes of these models. Dual-use is difficult to prevent as
instruction-following capabilities now enable standard attacks from computer
security. The capabilities of these instruction-following LLMs provide strong
economic incentives for dual-use by malicious actors. In particular, we show
that instruction-following LLMs can produce targeted malicious content,
including hate speech and scams, bypassing in-the-wild defenses implemented by
LLM API vendors. Our analysis shows that this content can be generated
economically and at a cost likely lower than with human effort alone. Together,
our findings suggest that LLMs will increasingly attract more sophisticated
adversaries and attacks, and addressing these attacks may require new
approaches to mitigations.
( 2
min )
Modern NLP systems exhibit a range of biases, which a growing literature on
model debiasing attempts to correct. However current progress is hampered by a
plurality of definitions of bias, means of quantification, and oftentimes vague
relation between debiasing algorithms and theoretical measures of bias. This
paper seeks to clarify the current situation and plot a course for meaningful
progress in fair learning, with two key contributions: (1) making clear
inter-relations among the current gamut of methods, and their relation to
fairness theory; and (2) addressing the practical problem of model selection,
which involves a trade-off between fairness and accuracy and has led to
systemic issues in fairness research. Putting them together, we make several
recommendations to help shape future work.
( 2
min )
The number of technology enthusiasts is increasing day by day with the
prevalence of technological products and easy access to the internet, and the
number of people working behind this rapid development is rising tremendously.
Computer programmers make up a large portion of those tech-savvy people.
Codeforces is an online programming and contest-hosting platform used by many
competitive programmers worldwide, and is regarded as one of the most
standardized platforms for practicing programming problems and participating
in programming contests. In this research, we propose a framework
that predicts the performance of any particular contestant in the upcoming
competitions as well as predicts the rating after that contest based on their
practice and the performance of their previous contests.
( 2
min )
We present a novel momentum-based first order optimization method (AGNES)
which provably achieves acceleration for convex minimization, even if the
stochastic noise in the gradient estimates is many orders of magnitude larger
than the gradient itself. Here we model the noise as having a variance which is
proportional to the magnitude of the underlying gradient. We argue, based upon
empirical evidence, that this is appropriate for mini-batch gradients in
overparameterized deep learning. Furthermore, we demonstrate that the method
achieves competitive performance in the training of CNNs on MNIST and CIFAR-10.
( 2
min )
We consider the sequential decision-making problem where the mean outcome is
a non-linear function of the chosen action. Compared with the linear model, two
curious phenomena arise in non-linear models: first, in addition to the
"learning phase" with a standard parametric rate for estimation or regret,
there is an "burn-in period" with a fixed cost determined by the non-linear
function; second, achieving the smallest burn-in cost requires new exploration
algorithms. For a special family of non-linear functions named ridge functions
in the literature, we derive upper and lower bounds on the optimal burn-in
cost, and in addition, on the entire learning trajectory during the burn-in
period via differential equations. In particular, a two-stage algorithm that
first finds a good initial action and then treats the problem as locally linear
is statistically optimal. In contrast, several classical algorithms, such as
UCB and algorithms relying on regression oracles, are provably suboptimal.
( 2
min )
We establish a dataset of over $1.6\times10^4$ experimental images of
Bose--Einstein condensates containing solitonic excitations to enable machine
learning (ML) for many-body physics research. About $33~\%$ of this dataset has
manually assigned and carefully curated labels. The remainder is automatically
labeled using SolDet -- an implementation of a physics-informed ML data
analysis framework -- consisting of a convolutional-neural-network-based
classifier and object detector (OD), as well as a statistically motivated
physics-informed
classifier and a quality metric. This technical note constitutes the definitive
reference of the dataset, providing an opportunity for the data science
community to develop more sophisticated analysis tools, to further understand
nonlinear many-body physics, and even advance cold atom experiments.
( 2
min )
We provide a first finite-particle convergence rate for Stein variational
gradient descent (SVGD). Specifically, whenever the target distribution is
sub-Gaussian with a Lipschitz score, SVGD with n particles and an appropriate
step size sequence drives the kernel Stein discrepancy to zero at an order
1/sqrt(log log n) rate. We suspect that the dependence on n can be improved,
and we hope that our explicit, non-asymptotic proof strategy will serve as a
template for future refinements.
( 2
min )
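The SVGD update analysed above is easy to state concretely. Below is a minimal one-dimensional sketch, assuming an RBF kernel and a standard-Gaussian target (whose score, -x, is Lipschitz, matching the sub-Gaussian Lipschitz-score setting); neither choice is prescribed by the abstract.

```python
import numpy as np

def svgd_step(x, score, h=1.0, eps=0.1):
    # One SVGD update in 1D with an RBF kernel:
    # phi(x_i) = (1/n) sum_j [ k(x_j, x_i) * score(x_j) + d/dx_j k(x_j, x_i) ]
    diff = x[:, None] - x[None, :]      # pairwise differences, shape (n, n)
    k = np.exp(-diff ** 2 / (2 * h))    # RBF kernel matrix (symmetric)
    grad_k = -diff / h * k              # derivative of k w.r.t. its first argument
    phi = (k @ score(x) + grad_k.sum(axis=0)) / len(x)
    return x + eps * phi

# Target: a standard Gaussian, whose score is -x.
rng = np.random.default_rng(0)
x = rng.normal(3.0, 0.5, size=50)       # particles initialised far from the target
for _ in range(500):
    x = svgd_step(x, lambda t: -t)
# The particle cloud drifts toward mean 0 and spreads out under the kernel repulsion.
```

The attractive term transports particles toward high-density regions; the repulsive term (`grad_k`) keeps them from collapsing onto the mode.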
We consider the problem of learning multioutput function classes in batch and
online settings. In both settings, we show that a multioutput function class is
learnable if and only if each single-output restriction of the function class
is learnable. This provides a complete characterization of the learnability of
multilabel classification and multioutput regression in both batch and online
settings. As an extension, we also consider multilabel learnability in the
bandit feedback setting and show a similar characterization as in the
full-feedback setting.
( 2
min )
Despite all the benefits of automated hyperparameter optimization (HPO), most
modern HPO algorithms are black-boxes themselves. This makes it difficult to
understand the decision process which leads to the selected configuration,
reduces trust in HPO, and thus hinders its broad adoption. Here, we study the
combination of HPO with interpretable machine learning (IML) methods such as
partial dependence plots. These techniques are increasingly used to explain
the marginal effect of hyperparameters on the black-box cost function or to
quantify the importance of hyperparameters. However, if such methods are
naively applied to the experimental data of the HPO process in a post-hoc
manner, the underlying sampling bias of the optimizer can distort
interpretations. We propose a modified HPO method which efficiently balances
the search for the global optimum w.r.t. predictive performance \emph{and} the
reliable estimation of IML explanations of an underlying black-box function by
coupling Bayesian optimization and Bayesian Algorithm Execution. On benchmark
cases of both synthetic objectives and HPO of a neural network, we demonstrate
that our method returns more reliable explanations of the underlying black-box
without a loss of optimization performance.
( 2
min )
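To illustrate the kind of IML explanation involved, a partial dependence curve can be estimated from evaluated configurations by averaging a surrogate's predictions over the remaining hyperparameters. The surrogate and hyperparameters below are toy stand-ins, not the paper's method; the point is only the shape of the computation.

```python
import numpy as np

def partial_dependence(surrogate, X, j, grid):
    # Marginal effect of hyperparameter j: average the surrogate's prediction
    # over the empirical distribution of the remaining hyperparameters.
    pd = []
    for v in grid:
        Xv = X.copy()
        Xv[:, j] = v
        pd.append(surrogate(Xv).mean())
    return np.array(pd)

# Toy stand-in for the black-box cost: quadratic in hp0, linear in hp1.
surrogate = lambda X: (X[:, 0] - 0.3) ** 2 + 0.1 * X[:, 1]
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 2))    # configurations evaluated during HPO
grid = np.linspace(0, 1, 11)
pd0 = partial_dependence(surrogate, X, 0, grid)
best = grid[pd0.argmin()]               # the PD curve bottoms out near hp0 = 0.3
```

The sampling-bias problem the abstract raises appears exactly here: if `X` came from an optimizer rather than a uniform design, the averages would be distorted toward well-performing regions.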
Graph Neural Networks (GNNs) have achieved much success on graph-structured
data. In light of this, there has been increasing interest in studying their
expressive power. One line of work studies the capability of GNNs to
approximate permutation-invariant functions on graphs, and another focuses on
their power as tests for graph isomorphism. Our work connects these two
perspectives and proves their equivalence. We further develop a framework of
the expressive power of GNNs that incorporates both of these viewpoints using
the language of sigma-algebra, through which we compare the expressive power of
different types of GNNs together with other graph isomorphism tests. In
particular, we prove that the second-order Invariant Graph Network fails to
distinguish non-isomorphic regular graphs with the same degree. Then, we extend
it to a new architecture, Ring-GNN, which succeeds in distinguishing these
graphs and achieves good performances on real-world datasets.
( 2
min )
We present a continuous-time probabilistic approach for estimating the chirp
signal and its instantaneous frequency function when the true forms of these
functions are not accessible. Our model represents these functions by
non-linearly cascaded Gaussian processes represented as non-linear stochastic
differential equations. The posterior distribution of the functions is then
estimated with stochastic filters and smoothers. We compute a (posterior)
Cram\'er--Rao lower bound for the Gaussian process model, and derive a
theoretical upper bound for the estimation error in the mean squared sense. The
experiments show that the proposed method outperforms a number of
state-of-the-art methods on synthetic data. We also show that the method
works out-of-the-box for two real-world datasets.
( 2
min )
In addition to the weights of shared synaptic connections, PNN includes
weights for synaptic effective ranges [14-24]. PNN balances synaptic strength
between the dynamics of synapse phagocytosis and the static constraint of a
constant total synapse length [14], and it incorporates the lead behavior of a
school of fish. Synapse formation inhibits dendrite generation to a certain
extent in both experiments and PNN simulations [15]. The memory-persistence
gradient of the retrograde circuit is similar to enforcing resilience in a
Spring Boot application. The relatively good and inferior gradient information
stored in memory engram cells during synapse formation in the retrograde
circuit resembles the folds of the brain [16]. Regarding the controversy over
whether human hippocampal neurogenesis persists throughout aging, PNN suggests
that a new, longer circuit may form in late iterations [17,18]. Closing the
critical period causes neurological disorders in experiments and PNN
simulations [19]. Considering the persistence of both negative and positive
memories activates synapse-length changes across iterations better than
considering positive memories alone [20]. In simulation, astrocytic
phagocytosis prevents the local accumulation of synapses; its absence causes
excitatory and functionally impaired synapses to accumulate in experiments,
leading to impaired cognition, and produces locally longer synapses and worse
results in PNN simulations [21]. This relates intelligence to cortical
thickness and to individual differences between brains [22]. PNN also models
the memory engram cells that strengthen synaptic strength [23]. The effects of
PNN's memory structure and tPBM may be the same, owing to the powerful
penetrability of the signals [24]. Memory persistence also inhibits local
synaptic accumulation. Through PNN, the relatively good and inferior solutions
may be introduced into PSO. The simple PNN has only synaptic phagocytosis.
( 3
min )
Recently, \cite{montasser2019vc} showed that finite VC dimension is not
sufficient for \textit{proper} adversarially robust PAC learning. In light of
this hardness result, there is a growing effort to study what type of
relaxations to the adversarially robust PAC learning setup can enable proper
learnability. In this work, we initiate the study of proper learning under
relaxations of the worst-case robust loss. We give a family of robust loss
relaxations under which VC classes are properly PAC learnable with sample
complexity close to what one would require in the standard PAC learning setup.
On the other hand, we show that for an existing and natural relaxation of the
worst-case robust loss, finite VC dimension is not sufficient for proper
learning. Lastly, we give new generalization guarantees for the adversarially
robust empirical risk minimizer.
( 2
min )
A formal write-up of the simple proof (1995) of the existence of calibrated
forecasts by the minimax theorem, which moreover shows that $N^3$ periods
suffice to guarantee a calibration error of at most $1/N$.
( 2
min )
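For context, the calibration error being bounded above can be estimated empirically with a binning scheme. The sketch below uses equal-width bins and an L1 weighting; these are illustrative choices, not the construction used in the write-up.

```python
import numpy as np

def calibration_error(forecasts, outcomes, n_bins=10):
    # L1 calibration error: per forecast bin, compare the mean forecast with
    # the empirical outcome frequency, weighted by the bin's share of the data.
    bins = np.minimum((forecasts * n_bins).astype(int), n_bins - 1)
    err, n = 0.0, len(forecasts)
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            err += mask.sum() / n * abs(forecasts[mask].mean() - outcomes[mask].mean())
    return err

# A perfectly calibrated forecaster evaluated on its own simulated outcomes
# should have error close to zero.
rng = np.random.default_rng(0)
p = rng.uniform(size=100_000)
y = (rng.uniform(size=100_000) < p).astype(float)
ece = calibration_error(p, y)
```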
The limit of infinite width allows for substantial simplifications in the
analytical study of over-parameterised neural networks. With a suitable random
initialisation, an extremely large network exhibits an approximately Gaussian
behaviour. In the present work, we establish a similar result for a simple
stochastic architecture whose parameters are random variables, holding both
before and during training. The explicit evaluation of the output distribution
allows for a PAC-Bayesian training procedure that directly optimises the
generalisation bound. For a large but finite-width network, we show empirically
on MNIST that this training approach can outperform standard PAC-Bayesian
methods.
( 2
min )
Recently there has been rising interest in the research of mean field
optimization, in particular because of its role in analyzing the training of
neural networks. In this paper by adding the Fisher Information as the
regularizer, we relate the regularized mean field optimization problem to a
so-called mean field Schrodinger dynamics. We develop an energy-dissipation
method to show that the marginal distributions of the mean field Schrodinger
dynamics converge exponentially quickly towards the unique minimizer of the
regularized optimization problem. Remarkably, the mean field Schrodinger
dynamics is proved to be a gradient flow on the probability measure space with
respect to the relative entropy. Finally we propose a Monte Carlo method to
sample the marginal distributions of the mean field Schrodinger dynamics.
( 2
min )
We consider the task of representing signals supported on graph bundles,
which are generalizations of product graphs that allow for "twists" in the
product structure. Leveraging the localized product structure of a graph
bundle, we demonstrate how a suitable partition of unity over the base graph
can be used to lift the signal on the graph into a space where a product
factorization can be readily applied. Motivated by the locality of this
procedure, we demonstrate that bases for the signal spaces of the components of
the graph bundle can be lifted in the same way, yielding a basis for the signal
space of the total graph. We demonstrate this construction on synthetic graphs,
as well as with an analysis of the energy landscape of conformational manifolds
in stereochemistry.
( 2
min )
This manuscript investigates the one-pass stochastic gradient descent (SGD)
dynamics of a two-layer neural network trained on Gaussian data and labels
generated by a similar, though not necessarily identical, target function. We
rigorously analyse the limiting dynamics via a deterministic and
low-dimensional description in terms of the sufficient statistics for the
population risk. Our unifying analysis bridges different regimes of interest,
such as the classical gradient-flow regime of vanishing learning rate, the
high-dimensional regime of large input dimension, and the overparameterised
"mean-field" regime of large network width, covering as well the intermediate
regimes where the limiting dynamics is determined by the interplay between
these behaviours. In particular, in the high-dimensional limit, the
infinite-width dynamics is found to remain close to a low-dimensional subspace
spanned by the target principal directions. Our results therefore provide a
unifying picture of the limiting SGD dynamics with synthetic data.
( 2
min )
We present a novel momentum-based first order optimization method (AGNES)
which provably achieves acceleration for convex minimization, even if the
stochastic noise in the gradient estimates is many orders of magnitude larger
than the gradient itself. Here we model the noise as having a variance which is
proportional to the magnitude of the underlying gradient. We argue, based upon
empirical evidence, that this is appropriate for mini-batch gradients in
overparameterized deep learning. Furthermore, we demonstrate that the method
achieves competitive performance in the training of CNNs on MNIST and CIFAR-10.
( 2
min )
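The abstract's noise model, variance proportional to the squared gradient magnitude, is easy to simulate. The sketch below runs a generic momentum method under that model on a quadratic; the actual AGNES update rule is not specified in the abstract, so this is only a stand-in showing that descent can survive multiplicative noise larger than the gradient itself.

```python
import numpy as np

def noisy_grad(grad, sigma, rng):
    # Gradient noise with standard deviation proportional to |grad|,
    # i.e. variance proportional to the squared gradient magnitude.
    return grad * (1.0 + sigma * rng.standard_normal(grad.shape))

# Generic momentum descent on f(x) = 0.5 * ||x||^2 under that noise model.
rng = np.random.default_rng(0)
x, v = np.ones(10) * 5.0, np.zeros(10)
lr, beta, sigma = 0.01, 0.9, 3.0        # noise three times the gradient itself
for _ in range(2000):
    g = noisy_grad(x, sigma, rng)       # the gradient of f is x itself
    v = beta * v + g
    x = x - lr * v
```

Because the noise scales with the gradient, it shrinks as the iterate approaches the minimizer, which is what makes convergence possible at all under this model.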
The Distributional Random Forest (DRF) is a recently introduced Random Forest
algorithm to estimate multivariate conditional distributions. Due to its
general estimation procedure, it can be employed to estimate a wide range of
targets such as conditional average treatment effects, conditional quantiles,
and conditional correlations. However, only results about the consistency and
convergence rate of the DRF prediction are available so far. We characterize
the asymptotic distribution of DRF and develop a bootstrap approximation of it.
This allows us to derive inferential tools for quantifying standard errors and
the construction of confidence regions that have asymptotic coverage
guarantees. In simulation studies, we empirically validate the developed theory
for inference of low-dimensional targets and for testing distributional
differences between two populations.
( 2
min )
We prove a convergence theorem for U-statistics of degree two, where the data
dimension $d$ is allowed to scale with sample size $n$. We find that the
limiting distribution of a U-statistic undergoes a phase transition from the
non-degenerate Gaussian limit to the degenerate limit, regardless of its
degeneracy and depending only on a moment ratio. A surprising consequence is
that a non-degenerate U-statistic in high dimensions can have a non-Gaussian
limit with a larger variance and asymmetric distribution. Our bounds are valid
for any finite $n$ and $d$, independent of individual eigenvalues of the
underlying function, and dimension-independent under a mild assumption. As an
application, we apply our theory to two popular kernel-based distribution
tests, MMD and KSD, whose high-dimensional performance has been challenging to
study. In a simple empirical setting, our results correctly predict how the
test power at a fixed threshold scales with $d$ and the bandwidth.
( 2
min )
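The MMD statistic whose high-dimensional behaviour is analysed above can be computed directly. A minimal sketch with a Gaussian kernel follows; the bandwidth scaling with d is an illustrative choice, not the paper's prescription.

```python
import numpy as np

def mmd2(X, Y, h):
    # Biased squared-MMD estimate with a Gaussian kernel of bandwidth h.
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * h))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(0)
d = 50                                   # dimension of the same order as the sample size
X = rng.normal(0.0, 1.0, size=(100, d))
Y = rng.normal(0.2, 1.0, size=(100, d))  # mean-shifted in every coordinate
stat = mmd2(X, Y, h=float(d))            # bandwidth grown with d
```

How this statistic, and the test power at a fixed threshold, scale jointly with d and h is exactly what the theory above predicts.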
Electronic health records (EHR) often contain sensitive medical information
about individual patients, posing significant limitations to sharing or
releasing EHR data for downstream learning and inferential tasks. We use
normalizing flows (NF), a family of deep generative models, to estimate the
probability density of a dataset with differential privacy (DP) guarantees,
from which privacy-preserving synthetic data are generated. We apply the
technique to an EHR dataset containing patients with pulmonary hypertension. We
assess the learning and inferential utility of the synthetic data by comparing
the accuracy in the prediction of the hypertension status and variational
posterior distribution of the parameters of a physics-based model. In addition,
we use a simulated dataset from a nonlinear model to compare the results from
variational inference (VI) based on privacy-preserving synthetic data, and
privacy-preserving VI obtained from directly privatizing NFs for VI with DP
guarantees given the original non-private dataset. The results suggest that
synthetic data generated through differentially private density estimation with
NF can yield good utility at a reasonable privacy cost. We also show that VI
obtained from differentially private NF based on the free energy bound loss may
produce variational approximations with significantly altered correlation
structure, and loss formulations based on alternative dissimilarity metrics
between two distributions might provide improved results.
( 2
min )
In a recent paper, Ling et al. investigated the over-parametrized Deep
Equilibrium Model (DEQ) with ReLU activation and proved that the gradient
descent converges to a globally optimal solution at a linear convergence rate
for the quadratic loss function. In this paper, we show that this fact still
holds for DEQs with any general activation which has bounded first and second
derivatives. Since the new activation function is generally non-linear, a
general population Gram matrix is designed, and a new form of dual activation
with Hermite polynomial expansion is developed.
( 2
min )
We propose a new bound for generalization of neural networks using Koopman
operators. Unlike most of the existing works, we focus on the role of the final
nonlinear transformation of the networks. Our bound is described by the
reciprocal of the determinant of the weight matrices and is tighter than
existing norm-based bounds when the weight matrices do not have small singular
values. According to existing theories about the low-rankness of the weight
matrices, it may be counter-intuitive that we focus on the case where singular
values of weight matrices are not small. However, motivated by the final
nonlinear transformation, we can see that our result sheds light on a new
perspective regarding a noise filtering property of neural networks. Since our
bound comes from Koopman operators, this work also provides a connection
between operator-theoretic analysis and generalization of neural networks.
Numerical results support the validity of our theoretical results.
( 2
min )
Here is a podcast episode with Noam Brown from Meta AI where we discuss his work on achieving human-level performance on poker and Diplomacy, as well as the power of spending compute at inference time!
submitted by /u/thejashGI
[link] [comments]
( 42
min )
I'm glad to share with you our Open Access survey paper about image super-resolution:
https://ieeexplore.ieee.org/abstract/document/10041995
The goal of this work is to give an overview of the abundance of publications in image super-resolution, give an introduction for new researchers, and open thriving discussions as well as point to potential future directions to advance the field :)
submitted by /u/Maleficent_Stay_7737
[link] [comments]
( 43
min )
Amazon Kendra is an intelligent search service powered by machine learning (ML). It indexes the documents stored in a wide range of repositories and finds the most relevant document based on the keywords or natural language questions the user has searched for. In some scenarios, you need the search results to be filtered based on […]
( 12
min )
We’re excited to announce that Amazon Personalize now lets you measure how your personalized recommendations can help you achieve your business goals. After specifying the metrics that you want to track, you can identify which campaigns and recommenders are most impactful and understand the impact of recommendations on your business metrics. All customers want to […]
( 10
min )
Love and creativity are in the air this Valentine’s Day in the NVIDIA Studio, as 3D artist Molly Brady presents a parody scene, The Birth of Venus (Redux), inspired by the iconic painting by Sandro Botticelli.
( 7
min )
How a single SYCL codebase makes it possible to run on multiple devices such as Intel GPUs, AMD GPUs, and NVIDIA GPUs Posted on behalf of Arti Gupta, Intel oneAPI Program Director The ever-growing scale and speed of High-Performance Computing (HPC) systems unleash many new opportunities for researchers and data scientists. Today, the first exascale-capable HPC systems,… Read More »Advancing HPC and AI through oneAPI Heterogeneous Programming in Academia and Research
The post Advancing HPC and AI through oneAPI Heterogeneous Programming in Academia and Research appeared first on Data Science Central.
( 20
min )
The world is going digital at a very fast speed. From retail shops to the cab industry to banking, all are changing and so is the healthcare industry. We can see a huge difference in the industry in terms of technology compared to ten years back. But there is a long way to go for… Read More »Top Healthcare App Development Trends That Will Dominate 2023
The post Top Healthcare App Development Trends That Will Dominate 2023 appeared first on Data Science Central.
( 22
min )
There’s no denying that we live in an app-driven world, and that’s especially true for modern businesses. Organizations use apps for almost everything. While this allows for faster communication, it can also lead to application fragmentation. App fragmentation is when an organization uses multiple applications to perform similar tasks. This creates an inefficient and disjointed… Read More »App Fragmentation & How To Avoid Siloed Communication: 3 Right Technologies for The Job
The post App Fragmentation & How To Avoid Siloed Communication: 3 Right Technologies for The Job appeared first on Data Science Central.
( 22
min )
So I just uploaded a devlog on YouTube about my bullet-dodging AI game. I discuss how I trained a Reinforcement Learning agent to learn to dodge bullets using Unity's ML-Agents package! The goal of the next devlog is to extend this to a 2-player setting, where a human player competes against a trained AI player to dodge/shoot bullets! I will probably be doing some MARL with self-play to achieve this, but this video is a single-agent setting.
I'm a baby Youtuber, so I appreciate yall for checking it out!
https://youtu.be/l9geEcn-A6Q
submitted by /u/AvvYaa
[link] [comments]
( 41
min )
This post is co-written by Zdenko Estok, Cloud Architect at Accenture and Sakar Selimcan, DeepRacer SME at Accenture. With the increasing use of artificial intelligence (AI) and machine learning (ML) for a vast majority of industries (ranging from healthcare to insurance, from manufacturing to marketing), the primary focus shifts to efficiency when building and training […]
( 8
min )
The method enables a model to determine its confidence in a prediction, while using no additional data and far fewer computing resources than other methods.
( 9
min )
Using available off-the-shelf AI services, I ended up making this video. I walk through the process and discuss some implications.
Here is the process that I followed:
1. Asked ChatGPT to create a script.
2. Asked a text-to-speech generative AI to convert the script into audio.
3. Asked MidJourney to create an avatar of a narrator.
4. Asked an audio-to-video generative AI to generate video from the avatar and audio.
https://ithinkbot.com/make-end-to-end-video-using-generative-ai-totally-free-try-it-out-dadee18302de
submitted by /u/Opitmus_Prime
[link] [comments]
( 41
min )
Hi guys,
I have made a video on YouTube here where I explain how we can measure the fairness of a machine learning model by using the disparate impact score.
I hope it may be of use to some of you out there. As always, feedback is more than welcomed! :)
submitted by /u/Personal-Trainer-541
[link] [comments]
( 41
min )
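For readers who want the formula behind the video: the disparate impact score is just a ratio of favorable-outcome rates between groups. A tiny sketch on toy data (not taken from the video):

```python
import numpy as np

def disparate_impact(y_pred, group):
    # Ratio of favorable-outcome rates between the unprivileged (0) and
    # privileged (1) groups; values below 0.8 are commonly flagged
    # under the "80% rule".
    rate_unpriv = y_pred[group == 0].mean()
    rate_priv = y_pred[group == 1].mean()
    return rate_unpriv / rate_priv

y_pred = np.array([1, 0, 0, 0, 1, 1, 1, 0])   # toy model decisions
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])   # toy group membership
di = disparate_impact(y_pred, group)          # 0.25 / 0.75
```

A score of 1.0 means both groups receive the favorable outcome at the same rate.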
Hi, I have just come across an AI course provided by OpenCV. It has a lot of computer vision content, but it costs $1599. Is anyone taking it? Any comments? Should I bet on this for a career change?
P.S. I have some basic programming knowledge and engineering background.
Here is the link to their course page.
https://opencv.org/courses/
submitted by /u/sumofjack
[link] [comments]
( 41
min )
If you want any more proof of how much AI has integrated itself into our daily lives, look no further than the map on your smartphone. Whether you use Google Maps or Apple Maps or Waze (also owned by Google), these AI-infused apps are amazing at getting you from Point A to Point B… Read More »AI Effectiveness Starts by Understanding User Intent
The post AI Effectiveness Starts by Understanding User Intent appeared first on Data Science Central.
( 22
min )
Incident management for cloud services is a complex process involving several
steps and has a huge impact on both service health and developer productivity.
On-call engineers require a significant amount of domain knowledge and manual
effort for root causing and mitigation of production incidents. Recent advances
in artificial intelligence have resulted in state-of-the-art large language
models like GPT-3.x (both GPT-3.0 and GPT-3.5), which have been used to solve a
variety of problems ranging from question answering to text summarization. In
this work, we do the first large-scale study to evaluate the effectiveness of
these models for helping engineers root cause and mitigate production
incidents. We do a rigorous study at Microsoft, on more than 40,000 incidents
and compare several large language models in zero-shot, fine-tuned, and
multi-task settings using semantic and lexical metrics. Lastly, our human
evaluation with actual incident owners shows the efficacy and future potential
of using artificial intelligence for resolving cloud incidents.
( 2
min )
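A lexical metric of the kind mentioned can be as simple as token-overlap F1 between a generated mitigation and a reference one. This is illustrative only; the study's exact metrics are not listed in the abstract.

```python
def token_f1(pred, ref):
    # Token-overlap F1 between a generated text and a reference text,
    # computed over unique lowercase tokens.
    p, r = set(pred.lower().split()), set(ref.lower().split())
    common = len(p & r)
    if common == 0:
        return 0.0
    prec, rec = common / len(p), common / len(r)
    return 2 * prec * rec / (prec + rec)

score = token_f1("restart the frontend service", "restart frontend service pods")
```

Semantic metrics replace the set intersection with embedding similarity, which is why studies like this report both.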
...Instead, it's introducing a new way for people to access the same information, one which can put a major dent in its market share (it’s almost 85% right now).
And Satya says he's willing to accept a "decrease in margins" of the Search business.
https://www.thestatuscode.co/p/the-ultimate-guide-to-the-ai-war
submitted by /u/pyactee
[link] [comments]
( 41
min )
I really want to play with the repo but I'm stuck at the last step of the instructions (https://github.com/lucidrains/musiclm-pytorch#usage-1). If anyone has tips, please let me know!
Here's the issue I have: https://github.com/lucidrains/musiclm-pytorch/issues/13
submitted by /u/BackgroundPass2082
[link] [comments]
( 42
min )
MIT spinout Verta offers tools to help companies introduce, monitor, and manage machine-learning models safely and at scale.
( 10
min )
This post is co-written with Jonathan Jung, Mike Band, Michael Chi, and Thompson Bliss at the National Football League. A coverage scheme refers to the rules and responsibilities of each football defender tasked with stopping an offensive pass. It is at the core of understanding and analyzing any football defensive strategy. Classifying the coverage scheme […]
( 14
min )
The metaverse, a term popularised by science fiction, refers to a shared virtual space where users can interact with each other in a virtual environment. It’s a convergence of real and virtual worlds, creating a new reality that exists simultaneously with the physical world. With the rapid advancement of technology, particularly in the field of… Read More »Metaverse Development: Building the Future of Virtual Reality
The post Metaverse Development: Building the Future of Virtual Reality appeared first on Data Science Central.
( 20
min )
The following guide provides an independent review of how well this OpenAI detection software performs and how its capabilities stack up against competitors (for finding AI-generated text and plagiarism): OpenAI Text Classifier: ChatGPT’s Own AI Detection - Review
submitted by /u/thumbsdrivesmecrazy
[link] [comments]
( 41
min )
Amazon Textract is a machine learning (ML) service that automatically extracts text, handwriting, and data from any document or image. AnalyzeDocument Signatures is a feature within Amazon Textract that offers the ability to automatically detect signatures on any document. This can reduce the need for human review, custom code, or ML experience. In this post, […]
Earth’s changing climate poses an increased risk of drought due to global warming. Since 1880, the global temperature has increased 1.01 °C. Since 1993, sea levels have risen 102.5 millimeters. Since 2002, the land ice sheets in Antarctica have been losing mass at a rate of 151.0 billion metric tons per year. In 2022, the […]
The chatbot’s success on the medical licensing exam shows that the test — and medical education — are flawed, Celi says.
Would like to hear what you guys think about this approach?
submitted by /u/ThePerson654321
Electric automaker XPENG’s flagship G9 SUV and P7 sports sedan are now available for order in Sweden, Denmark, Norway and the Netherlands — an expansion revealed last week at the eCar Expo in Stockholm. The intelligent electric vehicles are built on the high-performance NVIDIA DRIVE Orin centralized compute architecture and deliver AI capabilities that are…
Designing automotive visualizations can be incredibly time consuming. To make the renders look as realistic as possible, artists need to consider material textures, paints, realistic lighting and reflections, and more. For 3D artist David Baylis, it’s important to include these details and still create high-resolution renders in a short amount of time. That’s why he…
Venture to the Forgotten Realms this GFN Thursday in Baldur’s Gate 3, streaming on GeForce NOW. Celebrations for the cloud gaming service’s third anniversary continue with a Dying Light 2 reward that’s to die for. It’s the cherry on top of three new titles joining the GeForce NOW library this week. Roll for Initiative: Mysterious…
Machine learning (ML) has become ubiquitous. Our customers are employing ML in every aspect of their business, including the products and services they build, and for drawing insights about their customers. To build an ML-based application, you have to first build the ML model that serves your business requirement. Building ML models involves preparing the […]
The first NVIDIA Studio laptops powered by GeForce RTX 40 Series Laptop GPUs are now available, starting with systems from MSI and Razer — with many more to come.
Critical applications, such as in the medical field, require the rapid
provision of additional information to interpret decisions made by deep
learning methods. In this work, we propose a fast and accurate method to
visualize activations of classification and semantic segmentation networks by
stitching them with a GAN generator utilizing convolutions. We test our
approach on images of animals from the AFHQ wild dataset and real-world digital
pathology scans of stained tissue samples. Our method provides comparable
results to established gradient descent methods on these datasets while running
about two orders of magnitude faster.
We study online Reinforcement Learning (RL) in non-stationary input-driven
environments, where a time-varying exogenous input process affects the
environment dynamics. Online RL is challenging in such environments due to
catastrophic forgetting (CF). The agent tends to forget prior knowledge as it
trains on new experiences. Prior approaches to mitigate this issue assume task
labels (which are often not available in practice) or use off-policy methods
that can suffer from instability and poor performance.
We present Locally Constrained Policy Optimization (LCPO), an on-policy RL
approach that combats CF by anchoring policy outputs on old experiences while
optimizing the return on current experiences. To perform this anchoring, LCPO
locally constrains policy optimization using samples from experiences that lie
outside of the current input distribution. We evaluate LCPO in two gym and
computer systems environments with a variety of synthetic and real input
traces, and find that it outperforms state-of-the-art on-policy and off-policy
RL methods in the online setting, while achieving results on-par with an
offline agent pre-trained on the whole input trace.
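As a rough illustration of the anchoring idea (a sketch only; the distributions and penalty coefficient below are invented for the example and are not the authors' code), the on-policy loss can be augmented with a KL term that pins the current policy to its old outputs on states sampled from outside the current input distribution:

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """KL(p || q) for batched categorical distributions (rows sum to 1)."""
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

def anchored_loss(policy_loss, pi_new, pi_anchor, beta=1.0):
    """Add an anchoring penalty: keep the current policy close to its old
    outputs (pi_anchor) on buffered out-of-distribution states, while the
    base policy_loss is optimized on current experiences."""
    return policy_loss + beta * np.mean(kl(pi_anchor, pi_new))

# Toy numbers: the penalty vanishes when the policy matches its anchor,
# and grows as the policy drifts away from it.
pi = np.array([[0.7, 0.3], [0.5, 0.5]])
assert np.isclose(anchored_loss(1.0, pi, pi), 1.0)
```

The local constraint in the paper is applied as an actual optimization constraint rather than a fixed penalty, but the effect sketched here is the same: drift on old inputs is penalized, return on current inputs is not.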
Bilevel optimization has been developed for many machine learning tasks with
large-scale and high-dimensional data. This paper considers a constrained
bilevel optimization problem, where the lower-level optimization problem is
convex with equality and inequality constraints and the upper-level
optimization problem is non-convex. The overall objective function is
non-convex and non-differentiable. To solve the problem, we develop a
gradient-based approach, called gradient approximation method, which determines
the descent direction by computing several representative gradients of the
objective function inside a neighborhood of the current estimate. We show that
the algorithm asymptotically converges to the set of Clarke stationary points,
and demonstrate the efficacy of the algorithm by the experiments on
hyperparameter optimization and meta-learning.
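A toy illustration of the core idea (the paper's actual sampling scheme and descent-direction computation are more sophisticated; the test function and uniform neighborhood sampling below are assumptions for the sketch): sample several gradients inside a small neighborhood of the current estimate and combine them, so that nondifferentiable points still yield a usable direction:

```python
import numpy as np

def neighborhood_gradient(f_grad, x, radius=1e-3, n_samples=8, seed=0):
    """Approximate a descent direction for a nonsmooth function by averaging
    gradients sampled inside a small neighborhood of x -- a crude stand-in
    for an element of the Clarke subdifferential."""
    rng = np.random.default_rng(seed)
    perturbations = rng.uniform(-radius, radius, size=(n_samples, x.size))
    grads = np.array([f_grad(x + p) for p in perturbations])
    return grads.mean(axis=0)

# f(x) = |x| is nondifferentiable at 0; sampled gradients are +/-1, and the
# average lies in the Clarke subdifferential [-1, 1].
g = neighborhood_gradient(lambda x: np.sign(x), np.zeros(1))
assert -1.0 <= g[0] <= 1.0
```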
Contrary to its original interpretation as a facilitator of knowledge
transfer from one model to another, some recent studies have suggested that
knowledge distillation (KD) is instead a form of regularization. Perhaps the
strongest support of all for this claim is found in its apparent similarities
with label smoothing (LS). This paper investigates the stated equivalence of
these two methods by examining the predictive uncertainties of the models they
train. Experiments on four text classification tasks involving teachers and
students of different capacities show that: (a) In most settings, KD and LS
drive model uncertainty (entropy) in completely opposite directions, and (b) In
KD, the student's predictive uncertainty is a direct function of that of its
teacher, reinforcing the knowledge transfer view.
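The contrast is easy to see numerically. In this hedged sketch (standard textbook formulas, not the paper's code; the example label and teacher logits are made up), label smoothing mixes the one-hot label with a uniform distribution, while distillation uses the teacher's temperature-softened softmax, and the two targets generally carry different entropies:

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy (nats) of a categorical distribution."""
    return -np.sum(p * np.log(p + eps), axis=-1)

def ls_target(onehot, alpha=0.1):
    """Label smoothing: mix the one-hot label with the uniform distribution."""
    k = onehot.shape[-1]
    return (1 - alpha) * onehot + alpha / k

def kd_target(teacher_logits, temperature=2.0):
    """Knowledge distillation: the teacher's temperature-softened softmax."""
    z = teacher_logits / temperature
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

onehot = np.array([1.0, 0.0, 0.0])
teacher = np.array([5.0, 1.0, 0.5])
# Both are valid distributions, but their entropies differ even here.
assert not np.isclose(entropy(ls_target(onehot)), entropy(kd_target(teacher)))
```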
This work investigates the intersection of cross modal learning and semi
supervised learning, where we aim to improve the supervised learning
performance of the primary modality by borrowing missing information from an
unlabeled modality. We investigate this problem from a Nadaraya Watson (NW)
kernel regression perspective and show that this formulation implicitly leads
to a kernelized cross attention module. To this end, we propose The Attention
Patch (TAP), a simple neural network plugin that allows data level knowledge
transfer from the unlabeled modality. We provide numerical simulations on three
real world datasets to examine each aspect of TAP and show that a TAP
integration in a neural network can improve generalization performance using
the unlabeled modality.
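The NW-regression-as-attention connection can be sketched in a few lines (an illustrative toy, not the TAP implementation; the Gaussian kernel and bandwidth are assumptions): each prediction is a kernel-weighted average of values, which is exactly a softmax attention over query-key distances:

```python
import numpy as np

def nw_attention(queries, keys, values, bandwidth=1.0):
    """Nadaraya-Watson kernel regression written as softmax attention:
    each prediction is a Gaussian-kernel-weighted average of the values."""
    d2 = ((queries[:, None, :] - keys[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * bandwidth**2))
    w /= w.sum(axis=1, keepdims=True)  # normalize: softmax over keys
    return w @ values

keys = np.array([[0.0], [1.0]])
values = np.array([0.0, 10.0])
# A query equidistant from both keys gets the average of their values.
pred = nw_attention(np.array([[0.5]]), keys, values)
assert np.isclose(pred[0], 5.0)
```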
There has been much recent progress in forecasting the next observation of a
linear dynamical system (LDS), which is known as the improper learning, as well
as in the estimation of its system matrices, which is known as the proper
learning of LDS. We present an approach to proper learning of LDS, which in
spite of the non-convexity of the problem, guarantees global convergence of
numerical solutions to a least-squares estimator. We present promising
computational results.
We present a variety of novel information-theoretic generalization bounds for
learning algorithms in the supersample setting of Steinke & Zakynthinou
(2020), i.e., the setting of the "conditional mutual information" framework. Our
development exploits projecting the loss pair (obtained from a training
instance and a testing instance) down to a single number and correlating loss
values with a Rademacher sequence (and its shifted variants). The presented
bounds include square-root bounds, fast-rate bounds, including those based on
variance and sharpness, and bounds for interpolating algorithms etc. We show
theoretically or empirically that these bounds are tighter than all
information-theoretic bounds known to date on the same supersample setting.
Message Passing Neural Networks (MPNNs) are instances of Graph Neural
Networks that leverage the graph to send messages over the edges. This
inductive bias leads to a phenomenon known as over-squashing, where a node
feature is insensitive to information contained at distant nodes. Despite
recent methods introduced to mitigate this issue, an understanding of the
causes of over-squashing and of possible solutions is lacking. In this
theoretical work, we prove that: (i) Neural network width can mitigate
over-squashing, but at the cost of making the whole network more sensitive;
(ii) Conversely, depth cannot help mitigate over-squashing: increasing the
number of layers leads to over-squashing being dominated by vanishing
gradients; (iii) The graph topology plays the greatest role, since
over-squashing occurs between nodes at high commute (access) time. Our analysis
provides a unified framework to study different recent methods introduced to
cope with over-squashing and serves as a justification for a class of methods
that fall under `graph rewiring'.
This work studies the pure-exploration setting for the convex hull
feasibility (CHF) problem where one aims to efficiently and accurately
determine if a given point lies in the convex hull of means of a finite set of
distributions. We give a complete characterization of the sample complexity of
the CHF problem in the one-dimensional setting. We present the first
asymptotically optimal algorithm called Thompson-CHF, whose modular design
consists of a stopping rule and a sampling rule. In addition, we provide an
extension of the algorithm that generalizes several important problems in the
multi-armed bandit literature. Finally, we further investigate the Gaussian
bandit case with unknown variances and address how the Thompson-CHF algorithm
can be adjusted to be asymptotically optimal in this setting.
The recipe behind the success of deep learning has been the combination of
neural networks and gradient-based optimization. Understanding the behavior of
gradient descent however, and particularly its instability, has lagged behind
its empirical success. To add to the theoretical tools available to study
gradient descent we propose the principal flow (PF), a continuous time flow
that approximates gradient descent dynamics. To our knowledge, the PF is the
only continuous flow that captures the divergent and oscillatory behaviors of
gradient descent, including escaping local minima and saddle points. Through
its dependence on the eigendecomposition of the Hessian the PF sheds light on
the recently observed edge of stability phenomena in deep learning. Using our
new understanding of instability we propose a learning rate adaptation method
which enables us to control the trade-off between training stability and test
set evaluation performance.
https://www.theverge.com/2023/2/7/23587454/microsoft-bing-edge-chatgpt-ai
submitted by /u/currentscurrents
From Article:
Getty Images’ new lawsuit claims that Stability AI, the company behind the Stable Diffusion AI image generator, stole 12 million Getty images with their captions, metadata, and copyrights "without permission" to "train its Stable Diffusion algorithm."
The company has asked the court to order Stability AI to remove the infringing images from its website and to pay $150,000 for each.
However, it may be difficult to prove every violation. Getty has submitted over 7,000 images, with metadata and copyright registrations, that it says were used by Stable Diffusion.
submitted by /u/vadhavaniyafaijan
📢 News 📢
Pythae 0.1.0 is now out and supports distributed training using PyTorch DDP !
Train your favorite Variational Autoencoders (VAEs) faster 🏎️ and on larger datasets, still with a few lines of code 🖥️.
👉github: https://github.com/clementchadebec/benchmark_VAE
👉pypi: https://pypi.org/project/pythae/
submitted by /u/cchad-8
Hey guys, I’m the co-founder of a tech startup focused on providing free AI services. We’re one of the first mobile multipurpose AI apps.
We’ve developed a pretty cool app that offers AI services like image generation, code generation, image captioning, and more for free. We’re sort of like a Swiss Army knife of generative and analytical AI.
We’ve released a new feature called AAIA (Ask AI Anything), which is capable of answering all types of questions, even requests to generate literature, story-lines, jokes, general information, etc.
We’d love to have some people try it out, give us feedback, and keep in touch with us.
https://apps.apple.com/us/app/bright-eye/id1593932475
submitted by /u/BrightEyeuser
https://medium.com/seeds-for-the-future/the-next-step-for-generative-ai-830112890d04?sk=1d6b4c96cc6cb0a4690bcf9df0d12bcc
submitted by /u/arnolds112
This post is co-written with Stephen Aylward, Matt McCormick, Brianna Major from Kitware and Justin Kirby from the Frederick National Laboratory for Cancer Research (FNLCR). Amazon SageMaker Studio Lab provides no-cost access to a machine learning (ML) development environment to everyone with an email address. Like the fully featured Amazon SageMaker Studio, Studio Lab allows […]
Amazon SageMaker has announced the support of three new completion criteria for Amazon SageMaker automatic model tuning, providing you with an additional set of levers to control the stopping criteria of the tuning job when finding the best hyperparameter configuration for your model. In this post, we discuss these new completion criteria, when to use them, and […]
AI Weirdness: the strange side of machine learning
Announcements Machine Learning Controversy: From No-Code to No-Math One controversial topic in machine learning circles is code versus no-code. Can you be a real data scientist if you don’t code? Of course you can: You may be leveraging platforms and the code is one or two layers below the responsibilities of your job. Maybe you… Read More »DSC Weekly 7 February 2023 – Machine Learning Controversy: From No-Code to No-Math
The post DSC Weekly 7 February 2023 – Machine Learning Controversy: From No-Code to No-Math appeared first on Data Science Central.
Data labeling and/or data annotation has long been a critical component of many machine learning and AI initiatives. In recent years, the demand for accurate and reliable data labeling has risen dramatically as the process becomes increasingly vital to the success of numerous projects. But what is data labeling exactly? Data Labeling 2023 – how… Read More »The Impact of Data Labeling 2023: Current Trends & Future Demands
The post The Impact of Data Labeling 2023: Current Trends & Future Demands appeared first on Data Science Central.
Mobile Apps to Develop Your Data Science Skills - Mobile phones are the preferred medium for accomplishing even the smallest daily tasks. We no longer need to visit a restaurant to pick up food; we can order it from our favorite couch at home, thanks to food ordering apps. Not… Read More »Best 9 Mobile Apps to Develop Your Data Science Skills in 2023
The post Best 9 Mobile Apps to Develop Your Data Science Skills in 2023 appeared first on Data Science Central.
Doctors rarely make diagnoses based on a single factor — they look at a mix of data types, such as a patient’s symptoms, laboratory and radiology reports, and medical history. VinBrain, a Vietnam-based health-tech startup, is ensuring that AI diagnostics can take a similarly holistic view across vital signs, blood tests, medical images and more.
AI Seinfeld Transphobic rant - YouTube
submitted by /u/Status_Signal_4083
I have made a Stack Overflow post here. I will highly appreciate all your help on this. Thank you!
submitted by /u/Academic-Rent7800
It took me about 46 hours to run this on my 3080 at home. The original files were from the Blu-ray release, which was unfortunately pretty poorly done in my opinion. This version really gives it new life, I think.
Here's a link to the video result to see for yourself:
https://vimeo.com/796411232
And a link to the model I used!
https://github.com/TencentARC/AnimeSR
submitted by /u/VR_Angel
https://blog.google/technology/ai/bard-google-ai-search-updates/
submitted by /u/EducationalCicada
From the article:
Getty Images has filed a lawsuit in the US against Stability AI, creators of open-source AI art generator Stable Diffusion, escalating its legal battle against the firm.
The stock photography company is accusing Stability AI of “brazen infringement of Getty Images’ intellectual property on a staggering scale.” It claims that Stability AI copied more than 12 million images from its database “without permission ... or compensation ... as part of its efforts to build a competing business,” and that the startup has infringed on both the company’s copyright and trademark protections.
This is different from the UK-based news from weeks ago.
submitted by /u/Wiskkey
I made image captioning and clustering tools for computer vision and diffusion projects.
You can run almost everything automatically and with a simple CLI command. All contributions are welcome.
https://github.com/cobanov/image-clustering
https://github.com/cobanov/image-captioning
submitted by /u/metover
A new tool brings the benefits of AI programming to a much broader class of problems.
This blog post is co-written with Bruno Mateus, Jonathan Diedrich and Crispim Tribuna at Talkdesk. Contact centers are using artificial intelligence (AI) and natural language processing (NLP) technologies to build a personalized customer experience and deliver effective self-service support through conversational bots. This is the first of a two-part series dedicated to the integration of […]
Researchers continue to develop new model architectures for common machine learning (ML) tasks. One such task is image classification, where images are accepted as input and the model attempts to classify the image as a whole with object label outputs. With many models available today that perform this image classification task, an ML practitioner may […]
“I’ll tell you the problem with the scientific power that you’re using here: it didn’t require any discipline to attain it. You read what others had done and you took the next step. You didn’t earn the knowledge for yourselves, so you don’t take any responsibility for it. You stood on the shoulders of geniuses… Read More »It’s No Big Deal, but ChatGPT Changes Everything – Part III
The post It’s No Big Deal, but ChatGPT Changes Everything – Part III appeared first on Data Science Central.
Just a few days ago, January 28, we celebrated Data Protection Day, an international event aimed at promoting data privacy and security. In line with the goal of raising awareness about data protection, it would be a good time to discuss data security with Realtime Operating System. This unconventional operating system is widely used, so… Read More »Ensuring Data Security in Realtime Operating System (RTOS) Devices
The post Ensuring Data Security in Realtime Operating System (RTOS) Devices appeared first on Data Science Central.
A University of Toronto undergrad among an international team of researchers unleashing deep learning in the search for extraterrestrial civilizations.
Tweet thread: https://twitter.com/WholeMarsBlog/status/1622139178439036928
First impressions: this sucks ass. I can only ask about dogs and a few different types of prompts.
Does anyone else have experiences to share with this nerfed LaMDA beta google released?
submitted by /u/That_Violinist_18
https://youtu.be/ktdUeqzzhiA What text-to-speech does he use? He's been popping up on my YT feed lately, and I can see he has different voices in his videos; most of them sound robotic. Which one do you think is being used here?
submitted by /u/candidhorse4
How can we move from an idea to production in AI?
Does the technology readiness levels (TRL) help?
If you want some answers, please read this article on Medium:
https://medium.com/towards-artificial-intelligence/technology-readiness-levels-trl-in-ai-development-c6ed1190fbd6
All the ideas are more than welcome!
submitted by /u/Nice-Tomorrow2926
Hi all,
For my weekend project I figured I would build an AI driven spiritual successor to Mystery Science Theater 3000... Stop on by and watch the AI characters watch movies and make comments!
Today they are watching "The House on Haunted Hill" and "Plan 9 From Outer Space."
There's still a lot to do but I'm excited to play around with this more and see how it plays out and would love some feedback!
https://twitch.tv/MysteryAItheater
submitted by /u/caseigl
https://www.udemy.com/course/chatgpt-bot/?couponCode=5-DAYS-FREE
Hey everyone, I recently made a course about ChatGPT as a fun passion project. This is for anyone who wants to learn how to create automated workflows (using Chrome extensions) with ChatGPT. Specifically, you will create a ChatGPT bot that automatically answers your emails. It is beginner friendly and includes getting some good practice with JavaScript. I hope you enjoy it and I'm looking forward to your feedback/questions :)
submitted by /u/neuromodel
https://www.youtube.com/watch?v=8TOgN-U0ask&t=1s
After the Lensa AI controversy led many people to question whether AI really is creative or just "remixing" other artists' copyrighted work used without permission, many have wondered whether training AI on copyrighted images should be illegal. This talk makes some interesting comparisons, which might just mean the answer is no.
submitted by /u/BearNo21
From the Financial Times: https://www.ft.com/content/583ead66-467c-4bd5-84d0-ed5df7b5bf9c
Unpaywalled: https://archive.is/ciZPV
I guess I'm a little surprised; this feels like Google backing a competitor to 1) their own Google Brain teams, and 2) DeepMind. The cynical take might be that they're trying to lock in Anthropic, the same way Microsoft locked in OpenAI.
submitted by /u/bikeskata
Github: https://github.com/google/vizier
Google AI Blog: https://ai.googleblog.com/2023/02/open-source-vizier-towards-reliable-and.html
Tweet from Zoubin Ghahramani: https://twitter.com/ZoubinGhahrama1/status/1621321675936768000?s=20&t=ZEuz9oSc_GWYxixtXDskqA
submitted by /u/enderlayer
In this article (https://dallasinnovates.com/exclusive-qa-john-carmacks-different-path-to-artificial-general-intelligence/) there is a quote from John Carmack that reads: "I asked Ilya Sutskever, OpenAI’s chief scientist, for a reading list. He gave me a list of like 40 research papers and said, ‘If you really learn all of these, you’ll know 90% of what matters today.’"
My question is, what are these 40 papers?
submitted by /u/Gryphx
Can someone please help with this question - https://ai.stackexchange.com/questions/39029/why-does-advantage-learning-help-function-approximators
submitted by /u/Academic-Rent7800
This effort is focused on examining the behavior of reinforcement learning
systems in personalization environments and detailing the differences in policy
entropy associated with the type of learning algorithm utilized. We demonstrate
that Policy Optimization agents often possess low-entropy policies during
training, which in practice results in agents prioritizing certain actions and
avoiding others. Conversely, we also show that Q-Learning agents are far less
susceptible to such behavior and generally maintain high-entropy policies
throughout training, which is often preferable in real-world applications. We
provide a wide range of numerical experiments as well as theoretical
justification to show that these differences in entropy are due to the type of
learning being employed.
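For reference, the quantity being compared is ordinary Shannon entropy of the agent's action distribution. A minimal sketch (not the paper's experimental code; the example distributions are made up):

```python
import numpy as np

def policy_entropy(probs, eps=1e-12):
    """Mean Shannon entropy (nats) of a batch of action distributions."""
    return float(np.mean(-np.sum(probs * np.log(probs + eps), axis=-1)))

# A near-deterministic policy (the low-entropy pattern associated here with
# policy optimization) versus a near-uniform one (the Q-learning pattern).
peaked = np.array([[0.98, 0.01, 0.01]])
uniform = np.full((1, 3), 1 / 3)
assert policy_entropy(peaked) < policy_entropy(uniform)
assert np.isclose(policy_entropy(uniform), np.log(3), atol=1e-6)
```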
Learning-based behavior prediction methods are increasingly being deployed in
real-world autonomous systems, e.g., in fleets of self-driving vehicles, which
are beginning to commercially operate in major cities across the world. Despite
their advancements, however, the vast majority of prediction systems are
specialized to a set of well-explored geographic regions or operational design
domains, complicating deployment to additional cities, countries, or
continents. Towards this end, we present a novel method for efficiently
adapting behavior prediction models to new environments. Our approach leverages
recent advances in meta-learning, specifically Bayesian regression, to augment
existing behavior prediction models with an adaptive layer that enables
efficient domain transfer via offline fine-tuning, online adaptation, or both.
Experiments across multiple real-world datasets demonstrate that our method can
efficiently adapt to a variety of unseen environments.
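A heavily simplified sketch of the adaptive-layer idea (assumptions, not the paper's code: a frozen backbone supplies the features, targets are scalar, and a conjugate Gaussian prior is used): a Bayesian linear last layer whose posterior can be updated online in closed form from a handful of observations in the new environment:

```python
import numpy as np

class BayesianLastLayer:
    """A last-layer Bayesian linear regression that can be refit online from
    a few observations -- a toy stand-in for an adaptive prediction head
    sitting on top of a frozen behavior-prediction backbone."""

    def __init__(self, dim, prior_var=10.0, noise_var=0.1):
        self.precision = np.eye(dim) / prior_var  # posterior precision
        self.b = np.zeros(dim)                    # precision-weighted mean
        self.noise_var = noise_var

    def update(self, phi, y):
        """Condition on one (feature, target) pair in closed form."""
        self.precision += np.outer(phi, phi) / self.noise_var
        self.b += phi * y / self.noise_var

    def predict(self, phi):
        mean_w = np.linalg.solve(self.precision, self.b)
        return float(phi @ mean_w)

layer = BayesianLastLayer(dim=2)
for _ in range(50):  # adapt online to a new "environment"
    layer.update(np.array([1.0, 2.0]), 5.0)
assert abs(layer.predict(np.array([1.0, 2.0])) - 5.0) < 0.5
```

Because the update is closed-form, this kind of head supports both offline fine-tuning (batch of updates) and online adaptation (one update per observation) without touching the backbone.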
The higher speed, scalability and parallelism offered by ReRAM crossbar
arrays foster development of ReRAM-based next generation AI accelerators. At
the same time, sensitivity of ReRAM to temperature variations decreases
R_on/R_off ratio and negatively affects the achieved accuracy and reliability of
the hardware. Various works on temperature-aware optimization and remapping in
ReRAM crossbar arrays have reported up to 58% improvement in accuracy and
2.39× ReRAM lifetime enhancement. This paper classifies the challenges
caused by thermal heat, starting from constraints in ReRAM cells' dimensions
and characteristics to their placement in the architecture. In addition, it
reviews available solutions designed to mitigate the impact of these
challenges, including emerging temperature-resilient DNN training methods. Our
work also provides a summary of the techniques and their advantages and
limitations.
( 2
min )
Hierarchical Clustering is a popular unsupervised machine learning method
with decades of history and numerous applications. We initiate the study of
differentially private approximation algorithms for hierarchical clustering
under the rigorous framework introduced by Dasgupta (2016). We show strong
lower bounds for the problem: any $\epsilon$-DP algorithm must exhibit
$\Omega(|V|^2/ \epsilon)$-additive error for an input dataset $V$. Then, we exhibit
a polynomial-time approximation algorithm with $O(|V|^{2.5}/
\epsilon)$-additive error, and an exponential-time algorithm that meets the
lower bound. To overcome the lower bound, we focus on the stochastic block
model, a popular model of graphs, and, with a separation assumption on the
blocks, propose a private $1+o(1)$ approximation algorithm which also recovers
the blocks exactly. Finally, we perform an empirical study of our algorithms
and validate their performance.
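For readers unfamiliar with Dasgupta's (2016) objective: each graph edge is charged its weight times the number of leaves under the least common ancestor of its endpoints, and a good hierarchy minimizes the total. A small illustrative sketch of the cost itself (our own, unrelated to the private algorithms):

```python
def dasgupta_cost(tree, edges):
    # tree: binary hierarchy as nested 2-tuples with hashable leaf labels.
    # edges: dict mapping (u, v) pairs to weights.
    # Each edge pays weight * |leaves(lca(u, v))|; lower total = better tree.
    cost = 0
    def walk(node):
        nonlocal cost
        if not isinstance(node, tuple):
            return {node}
        left, right = walk(node[0]), walk(node[1])
        here = left | right
        for (u, v), w in edges.items():
            # this node is the LCA iff the endpoints split across its children
            if (u in left and v in right) or (u in right and v in left):
                cost += w * len(here)
        return here
    walk(tree)
    return cost
```

For the tree (("a", "b"), "c") with unit-weight edges a-b and b-c, the a-b edge pays 2 and the b-c edge pays 3, for a total cost of 5.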
( 2
min )
Generative adversarial networks (GANs) have many application areas including
image editing, domain translation, missing data imputation, and support for
creative work. However, GANs are considered 'black boxes'. Specifically, the
end-users have little control over how to improve editing directions through
disentanglement. Prior work focused on new GAN architectures to disentangle
editing directions. Alternatively, we propose GANravel a user-driven direction
disentanglement tool that complements the existing GAN architectures and allows
users to improve editing directions iteratively. In two user studies with 16
participants each, GANravel users were able to disentangle directions and
outperformed the state-of-the-art direction discovery baselines in
disentanglement performance. In the second user study, GANravel was used in a
creative task of creating dog memes and was able to create high-quality edited
images and GIFs.
( 2
min )
Sparseness and robustness are two important properties for many machine
learning scenarios. In the present study, regarding the maximum correntropy
criterion (MCC) based robust regression algorithm, we investigate to integrate
the MCC method with the automatic relevance determination (ARD) technique in a
Bayesian framework, so that MCC-based robust regression could be implemented
with adaptive sparseness. To be specific, we use an inherent noise assumption
from the MCC to derive an explicit likelihood function, and realize the maximum
a posteriori (MAP) estimation with the ARD prior by variational Bayesian
inference. Compared to the existing robust and sparse L1-regularized MCC
regression, the proposed MCC-ARD regression eliminates the troublesome tuning
of the regularization hyper-parameter that controls the regularization
strength. Further, MCC-ARD achieves prediction performance and feature
selection capability superior to L1-regularized MCC, as demonstrated by a
noisy, high-dimensional simulation study.
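As background, the correntropy objective replaces squared error with a Gaussian-kernel similarity between prediction and target, which is what buys the robustness: large residuals are smoothly down-weighted rather than squared. A minimal sketch of the criterion itself (the ARD/variational machinery is beyond a few lines):

```python
import math

def correntropy(errors, sigma=1.0):
    # Mean Gaussian-kernel similarity of residuals; MCC methods maximize this.
    # A zero residual contributes 1; a huge residual contributes ~0, so a
    # single outlier cannot dominate the objective as it does for squared error.
    return sum(math.exp(-e * e / (2 * sigma ** 2)) for e in errors) / len(errors)
```

With residuals [0, 0, 100], the outlier barely moves the score, whereas it would dominate a mean-squared-error objective.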
( 2
min )
We quantify the parameter stability of a spherical Gaussian Mixture Model
(sGMM) under small perturbations in distribution space. Namely, we derive the
first explicit bound showing that, for a mixture of spherical Gaussians $P$
(sGMM) in a pre-defined model class, every other sGMM in this model class that
is close to $P$ in total variation distance also has a small parameter distance
to $P$. Further, this upper bound depends only on $P$. The motivation for this
work lies in providing guarantees for fitting mixtures of spherical Gaussians;
with this aim in mind, all the constants involved are well defined and the
conditions are distribution-free. Our results tighten
considerably the existing computable bounds, and asymptotically match the known
sharp thresholds for this problem.
( 2
min )
Today, the NFL is continuing their journey to increase the number of statistics provided by the Next Gen Stats Platform to all 32 teams and fans alike. With advanced analytics derived from machine learning (ML), the NFL is creating new ways to quantify football, and to provide fans with the tools needed to increase their […]
( 10
min )
The National Football League (NFL) is one of the most popular sports leagues in the United States and is the most valuable sports league in the world. The NFL, BioCore, and AWS are committed to advancing human understanding around the diagnosis, prevention, and treatment of sports-related injuries to make the game of football safer. More […]
( 10
min )
I wanted to use the Learnable Triangulation model in a commercial project. The source code itself is under an MIT license. However, the dataset they used is Human3.6M, whose license states it is "FREE OF CHARGE FOR ACADEMIC USE ONLY".
Yet, recent court rulings (in the US) state that models can use copyrighted data during training, and the results are no longer bound by that copyright (e.g. Google Books). Does the same apply here?
submitted by /u/mfarahmand98
[link] [comments]
( 42
min )
Cheers to another year of cloud gaming! GeForce NOW celebrates its third anniversary with a look at how far cloud gaming has come, a community celebration and 25 new games supported in February. Members can celebrate all month long, starting with a sweet Dying Light 2 reward and support for nine more games this week, Read article >
( 7
min )
NVIDIA A100 Tensor Core GPUs running on Supermicro servers have captured leading results for inference in the latest STAC-ML Markets benchmark, a key technology performance gauge for the financial services industry. The results show NVIDIA demonstrating unrivaled throughput — serving up thousands of inferences per second on the most demanding models — and top latency Read article >
( 6
min )
For several years, NVIDIA has been working with some of the world’s leading financial institutions to develop and execute a wide range of rapidly evolving AI strategies. For the past three years, we’ve asked them to tell us collectively what’s on the top of their minds. Sometimes the results are just what we thought they’d Read article >
( 6
min )
https://www.axios.com/2023/02/01/chatgpt-subscriptions-chatbot-openai
Not fully paywalled, but there's a tiering system.
submitted by /u/bikeskata
[link] [comments]
( 42
min )
GitHub (sadly without weights). https://github.com/PetchMa/ML_GBT_SETI
News.
https://www-scinexx-de.translate.goog/news/kosmos/seti-findet-acht-potenzielle-alien-signale/?_x_tr_sl=de&_x_tr_tl=en&_x_tr_hl=de&_x_tr_pto=wapp
submitted by /u/logTom
[link] [comments]
( 44
min )
This is laughable. They were sitting on all of the technology. And now they scramble to do something better than 10 links. I for myself will be disappointed with anything less than the movie Her.
It's a high bar. Maybe. I would not expect personality. Maybe some rudimentary memory. But the ability to perform almost any digital task must be there. It can be built in a garage using open source projects. COME ON. Some good programmers and a hackathon. Yes, I am waiting for the Stability AI model. Or maybe the GPT-3 API can be used. But
submitted by /u/nikitastaf1996
[link] [comments]
( 41
min )
In this blog post, we will take a closer look at the implications of ChatGPT’s authorship, the role of AI in scientific literature, and…
Continue reading on Becoming Human: Artificial Intelligence Magazine »
( 8
min )
Linear & Logistic: The Relationship Between Regression Models
Continue reading on Becoming Human: Artificial Intelligence Magazine »
( 11
min )
Hello and welcome to the blog! My name is ChatGPT, and I am a large language model trained by OpenAI.
P.S. This article includes a use…
( 9
min )
More than $1 million in funding available to selected Solver teams and fellows.
( 7
min )
Almost 80% of today’s web content is user-generated, creating a deluge of content that organizations struggle to analyze with human-only processes. The availability of consumer information helps them make decisions, from buying a new pair of jeans to securing home loans. In a recent survey, 79% of consumers stated they rely on user videos, comments, […]
( 10
min )
Recent developments in deep learning have led to increasingly large models such as GPT-3, BLOOM, and OPT, some of which are already in excess of 100 billion parameters. Although larger models tend to be more powerful, training such models requires significant computational resources. Even with the use of advanced distributed training libraries like FSDP and […]
( 11
min )
Hi guys,
I have made a video on YouTube here where I explain how deltas and delta-deltas features are computed. These are used quite a lot in speech recognition systems.
I hope it may be of use to some of you out there. As always, feedback is more than welcomed! :)
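For anyone who wants the formula alongside the video: deltas are the slope of a least-squares line fit over a small window of frames, and delta-deltas apply the same regression to the deltas. A plain-Python sketch of the standard computation:

```python
def delta(frames, N=2):
    # frames: list of per-frame coefficient vectors (e.g. MFCCs).
    # d_t = sum_{n=1..N} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..N} n^2),
    # with edge frames replicated so the window never runs off the ends.
    T, dim = len(frames), len(frames[0])
    denom = 2 * sum(n * n for n in range(1, N + 1))
    out = []
    for t in range(T):
        row = []
        for k in range(dim):
            num = sum(
                n * (frames[min(t + n, T - 1)][k] - frames[max(t - n, 0)][k])
                for n in range(1, N + 1)
            )
            row.append(num / denom)
        out.append(row)
    return out

def delta_delta(frames, N=2):
    # Delta-deltas (acceleration features): the same regression applied twice.
    return delta(delta(frames, N), N)
```

Sanity check: for coefficients that increase linearly frame to frame, the interior deltas come out exactly equal to the slope, and the delta-deltas come out zero.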
submitted by /u/Personal-Trainer-541
[link] [comments]
( 41
min )
Things are a lot sunnier these days for designers looking to visualize their projects in NVIDIA Omniverse, a platform for creating and operating metaverse applications.
( 6
min )
Artificial intelligence is the new electricity. The fifth industrial revolution. And companies that go all-in on AI are reaping the rewards. So how do you make that happen? That big question — how? — is explored by Nitin Mittal, principal at Deloitte, one of the world’s largest professional services organizations, and co-author Thomas Davenport in Read article >
( 4
min )
OpenAI is developing a new tool to help distinguish between AI-written and human-written text. Here is an unofficial Python wrapper of the OpenAI model to detect whether a text was written by #chatgpt, #gpt3, #gpt, etc.
Github: https://github.com/promptslab/openai-detector
submitted by /u/StoicBatman
[link] [comments]
( 42
min )
I would like to invite interested people to collaborate on this hobby project of mine.
This is still in an early-stage, and I believe it can be significantly improved together.
The GitHub repository link is here: https://github.com/kayuksel/multi-rl-crowd-sim
Note: The difference from StarCraft is that Dragons can hide behind each other.
Their hitting strength is also reduced, proportional to the decrease in their health.
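If it helps prospective collaborators, the health-scaled damage rule described above can be as simple as the following (illustrative sketch; the actual repo code may differ):

```python
def hit_strength(base_damage, health, max_health):
    # Damage dealt scales linearly with the attacker's remaining health
    # fraction, so wounded Dragons hit proportionally softer.
    return base_damage * (health / max_health)
```

A Dragon at half health thus deals half its base damage.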
submitted by /u/k_yuksel
[link] [comments]
( 41
min )
Analyst reports. Academic papers. Ph.D. programs. There are a lot of places you can go to get a glimpse of the future. But the best place might just be El Coyote Cojo, a whiskey-soaked dive bar that doesn’t exist in real life. Fire up Cyberpunk 2077 and you’ll see much more than the watering hole’s Read article >
( 6
min )
Broadcasters have an arsenal of new features and technologies at their disposal; the eighth-generation NVIDIA video encoder on RTX 40 Series GPUs with support for the open AV1 video-coding format; new NVIDIA Broadcast app effects like Eye Contact and Vignette; and support for AV1 streaming in Discord.
( 7
min )
We’re launching a classifier trained to distinguish between AI-written and human-written text.
We’ve trained a classifier to distinguish between text written by a human and text written by AIs from a variety of providers. While it is impossible to reliably detect all AI-written text, we believe
( 3
min )
Announcements Data Models for the Weather With January coming to an end, we here in the Northeast let out a collective sigh of relief as the month ends without any major snowstorms that tend to happen in the first month of the year. Weather forecasting is a centuries-old practice that has its roots in divination… Read More »DSC Weekly 31 January 2023 – Data Models for the Weather
The post DSC Weekly 31 January 2023 – Data Models for the Weather appeared first on Data Science Central.
( 19
min )
In the previous article, we looked at two Ever-Successful NFL teams, the Kansas City Chiefs and the San Francisco 49ers, who seem to be able to win consistently even while things change around them and players and coaches come and go. Then, we looked at two Never-Successful teams, the Arizona Cardinals and the Cleveland Browns,… Read More »Exploding vs. Imploding: What the NFL Has to Teach Us About Managing Agile Enterprises, Part II
The post Exploding vs. Imploding: What the NFL Has to Teach Us About Managing Agile Enterprises, Part II appeared first on Data Science Central.
( 26
min )
I've been thinking a lot about Marshall McLuhan and his 4 laws of media. Specifically, the one that states that all new forms of media cause something to be retrieved from the past. What will ChatGPT and AI revive and retrieve? I put some more thoughts in my blog. Would love to hear your thoughts on it.
https://bobhutchins.substack.com/p/what-media-format-will-chatgpt-and
submitted by /u/Interesting_Status64
[link] [comments]
( 41
min )
1D: MusicLM, VALL-E
2D: Stable Diffusion, DALL-E, MidJourney
3D (or 2+1D): Imagen-video, Phenaki
3D: Magic3D, DreamFusion, Point-E
4D (or 3+1D): Make-A-Video-3D
[Searchcolab] What’s next? 🤔
submitted by /u/Maleficent_Suit1591
[link] [comments]
( 41
min )
https://ainewsbase.com/google-musiclm-copyright-issues-not-releasing/
The samples they do show might just sound off because of how the files were stored, but the audio definitely sounds kind of weird.
submitted by /u/SPEEDYFISHY2000
[link] [comments]
( 40
min )
https://peltarion.com/blog/data-science/towards-a-token-free-future-in-nlp
submitted by /u/EducationalCicada
[link] [comments]
( 42
min )
I’m an ML Engineer at Hive AI and I’ve been working on a ChatGPT Detector.
Here is a free demo we have up: https://hivemoderation.com/ai-generated-content-detection
From our benchmarks it’s significantly better than similar solutions like GPTZero and OpenAI’s GPT2 Output Detector. On our internal datasets, we’re seeing balanced accuracies of >99% for our own model compared to around 60% for GPTZero and 84% for OpenAI’s GPT2 Detector.
Feel free to try it out and let us know if you have any feedback!
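For context on the metric quoted above: balanced accuracy is the mean of per-class recalls, so a detector that labels everything "AI" scores 50% on a skewed dataset rather than its raw accuracy. A quick sketch of the computation:

```python
def balanced_accuracy(y_true, y_pred):
    # Mean of per-class recalls; robust to class imbalance, unlike raw accuracy.
    recalls = []
    for c in set(y_true):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)
```

This is why it is the fairer number to compare detectors on, given that real-world text streams are rarely 50/50 human vs. AI.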
submitted by /u/qthai912
[link] [comments]
( 56
min )
In order to improve my speaking skills, I am doing a little series on how to set up Stable Diffusion on Paperspace, and I am astounded by how much time the audio editing takes. Part of the reason is that I've only been doing this for 3 days and my process is very inefficient, but it feels like, by now, neural nets should be able to do things like remove uhms, lip smacking, and breath intakes.
I've looked around, and this post from 9 years ago says the only choice is to edit it by hand. Is that still true?
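Fully automatic uhm removal still seems to need a learned model, but a crude energy gate gets part of the way: flag low-RMS spans (pauses, breaths) so you can jump straight to them in the editor. A stdlib-only sketch:

```python
def find_quiet_spans(samples, rate, frame_ms=20, threshold=0.01):
    # Crude energy gate: return (start_s, end_s) spans whose per-frame RMS
    # falls below threshold -- a first pass to flag breaths/pauses for review,
    # not a replacement for a learned model.
    n = int(rate * frame_ms / 1000)
    spans, start = [], None
    for i in range(0, len(samples) - n + 1, n):
        frame = samples[i:i + n]
        rms = (sum(s * s for s in frame) / n) ** 0.5
        if rms < threshold:
            if start is None:
                start = i / rate
        elif start is not None:
            spans.append((start, i / rate))
            start = None
    if start is not None:
        spans.append((start, len(samples) / rate))
    return spans
```

Breaths and lip smacks are quiet but not silent, so the threshold needs tuning per recording; this only narrows down where to look.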
submitted by /u/abstractcontrol
[link] [comments]
( 43
min )
From the given link, I gather that it is a large-scale Transformer trained to use digital tools like a web browser. Right now, it's hooked up to a Chrome extension which allows it to observe what's happening in the browser and take certain actions, like clicking, typing, and scrolling.
I am interested in knowing the broad steps involved in building something like this.
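At a broad-strokes level, systems like this are an observe-decide-act loop around a language model. A purely hypothetical skeleton (all names are illustrative; the real system's interfaces are not public):

```python
def agent_loop(observe, model, act, max_steps=10):
    # Hypothetical skeleton of a browser-using agent.
    history = []
    for _ in range(max_steps):
        page_state = observe()               # e.g. a DOM snapshot from the extension
        action = model(page_state, history)  # LM maps state + history to an action
        if action == "DONE":
            break
        act(action)                          # click / type / scroll in the browser
        history.append((page_state, action))
    return history
```

The hard parts are then serializing the page state compactly enough for the model's context window, and training the model (e.g. from demonstrations) to emit valid actions.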
submitted by /u/smred123
[link] [comments]
( 43
min )
https://github.com/tysam-code/hlb-CIFAR10
submitted by /u/tysam_and_co
[link] [comments]
( 53
min )
During the 1970s, Ethernet pioneer and 3Com Internet equipment company founder Bob Metcalfe was working on something called the “Data Reconfiguration Service” for the early Internet. “It was an effort to write a special purpose programming language to convert data formats,” Metcalfe said during a 2021 OriginTrail.io panel session. “And the goal was so that… Read More »Enabling contextual computing in today’s enterprise information fabrics
The post Enabling contextual computing in today’s enterprise information fabrics appeared first on Data Science Central.
( 21
min )
Amazon SageMaker provides a suite of built-in algorithms, pre-trained models, and pre-built solution templates to help data scientists and machine learning (ML) practitioners get started on training and deploying ML models quickly. You can use these algorithms and models for both supervised and unsupervised learning. They can process various types of input data, including tabular, […]
( 12
min )
Amazon Forecast is a fully managed service that uses machine learning (ML) to generate highly accurate forecasts, without requiring any prior ML experience. Forecast is applicable in a wide variety of use cases, including estimating supply and demand for inventory management, travel demand forecasting, workforce planning, and computing cloud infrastructure usage. You can use Forecast […]
( 10
min )
Things the video covers:
What is intelligence? What is A.I.?
What is the best currently available, and what are the benefits?
How does it work? What are the downsides?
The increasing speed of human technological advancement
Why A.I. actually terrifies me! (Some scenarios)
I hope you enjoy it!
submitted by /u/casualbob_uk
[link] [comments]
( 41
min )
https://youtu.be/Y6gXZ61NnOE
submitted by /u/sigmabruuh
[link] [comments]
( 42
min )
I recently created a website called https://cashwithai.com that is dedicated to helping people learn how to make money using AI like ChatGPT. The website offers a variety of resources, including a QuickStart guide, case studies, and tips and tricks for monetizing AI-generated content.
Additionally, I'm offering free 1-on-1 consultations to anyone who is looking for personalized advice and guidance on how to make money with AI. I'm not running ads or charging; I run purely off donations.
Let me know if you have any questions!
submitted by /u/Chadcash
[link] [comments]
( 41
min )
Hi everyone, I made a JupyterLab extension to use OpenAI’s GPT models for code and text completion on your notebook cells.
This extension passes your current notebook cell to the GPT API and completes your code/text for you. You can customize the GPT parameters in the Advanced Settings menu.
I made this extension when I couldn't find any Copilot/Codex extensions for JupyterLab. It doesn't make sense that ML folks don't have an easy way to use AI-generated code in their own tools. VS Code does allow you to use Copilot, but I've gotten used to Jupyter, and a lot of ML/DS folks I know still prefer Jupyter over VS Code.
Installation
pip install gpt_jupyterlab
GitHub Repo: https://github.com/henshinger/gpt-jupyterlab/
Demo
GPT JupyterLab Demo
Note: You will need your own OpenAI API Key to use this extension.
Would love to get your feedback!
submitted by /u/henshinger
[link] [comments]
( 44
min )
I am building an open-source ML observability and refinement toolkit, which recently received investment from Y Combinator.
The tool helps ML practitioners to:
1. Understand how their models are performing in production
2. Catch edge cases and outliers to help them refine their models
3. Customise the tool according to their needs (hence, open-source)
4. Bring data security to the forefront (hence, self-hosted)
You can check out the project at https://github.com/uptrain-ai/uptrain; I would love to hear feedback from the community.
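To give a flavor of the edge-case catching, the simplest possible check is a z-score gate on a monitored feature or model score (toy sketch, not UpTrain's actual implementation):

```python
def flag_outliers(values, k=3.0):
    # Flag points more than k standard deviations from the batch mean --
    # the most basic "catch edge cases in production" check.
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [v for v in values if std and abs(v - mean) > k * std]
```

Real tools layer distribution-shift tests and domain-specific signals on top, but this is the shape of the check.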
submitted by /u/Vegetable-Skill-9700
[link] [comments]
( 43
min )
https://pypi.org/project/rwkvstic/
Currently supports tensorflow, pytorch, jax
Also has support for tensor streaming, 8-bit JIT quantization, and multi-GPU.
Run RWKV 7B on 8 GB of VRAM, or 14B on 16 GB.
submitted by /u/hazardous1222
[link] [comments]
( 42
min )
Bright Eye: mobile AI app that generates art, code, poems, essays, short stories, answers questions, and more!
Hey guys, I’m the cofounder of a tech startup focused on providing free AI services. We’re one of the first mobile multipurpose AI apps.
We’ve developed a pretty cool app that offers AI services like image generation, code generation, image captioning, and more for free. We’re sort of like a Swiss Army knife of generative and analytical AI.
We’ve released a new feature called AAIA (Ask AI Anything), which is capable of answering all types of questions and handling requests to generate literature, storylines, and more (think of ChatGPT).
We’d love to have some people try it out, give us feedback, and keep in touch with us.
https://apps.apple.com/us/app/bright-eye/id1593932475
submitted by /u/SonnyDoge22
[link] [comments]
( 41
min )
https://www.youtube.com/watch?v=Vw-t826JcDQ
submitted by /u/Optimal_Studio_2050
[link] [comments]
( 40
min )
Specifically this one:
https://www.youtube.com/watch?v=MFv7apjatwM&ab_channel=Lux-Topic
If there is no current AI that is able to listen to a song and write down the lyrics accurately, then I provide this idea freely.
submitted by /u/A_Very_Horny_Zed
[link] [comments]
( 40
min )
University of Florida - Warrington College of Business's Mo Wang offers advice for the future of work.
Full Story: https://explore.research.ufl.edu/the-future-of-work.html#ai-hiring
submitted by /u/ufexplore
[link] [comments]
( 40
min )
I'm looking into projects which augment the RLHF training approach of ChatGPT with explicit rules, such as in https://paperswithcode.com/paper/constitutional-ai-harmlessness-from-ai.
Ideally there would be both rules and priority levels between the rules, similar to Asimov's laws of robotics.
The Open-Assistant project (https://github.com/LAION-AI/Open-Assistant) captures the spirit, but it is focused on replicating ChatGPT at the moment.
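In case it helps the discussion, the "rules with priority levels" part can be prototyped as an ordered list of predicates, where the highest-priority violated rule selects the critique prompt for the Constitutional-AI-style revision step (toy sketch, not from any of the linked projects):

```python
def first_violated_rule(draft, rules):
    # rules: ordered (name, predicate) pairs; earlier = higher priority.
    # predicate(draft) returns True when the draft satisfies the rule.
    for name, ok in rules:
        if not ok(draft):
            return name
    return None

# Illustrative rule set (priority order loosely echoing Asimov's laws):
rules = [
    ("no_harm", lambda s: "how to build a weapon" not in s.lower()),
    ("no_insult", lambda s: "idiot" not in s.lower()),
]
```

The returned rule name would then select a critique/revision prompt in the self-critique loop; real rules would of course be judged by a model, not by string matching.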
submitted by /u/lorepieri
[link] [comments]
( 42
min )
Find the release notes here:
https://github.com/nnaisense/evotorch/releases/tag/v0.4.0
A big highlight is how fast these implementations are! I genuinely believe GPU acceleration is the future of evolutionary algorithms, and EvoTorch and its integration into the PyTorch ecosystem is a fantastic enabler for this.
To demonstrate the raw speed provided by the new release, I compared EvoTorch's CMA-ES implementation to that provided by the popular pycma package on the 80-dimensional Rastrigin problem and tracked the run-time:
Performance was measured over 50 runs on the 80-dimensional Rastrigin problem
The crazy thing to note is that when we switch to GPU (Tesla V100), we can efficiently run CMA-ES with population sizes going into 100k+!
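For readers who want to reproduce a benchmark like this, the test function itself is tiny; any ES library (EvoTorch, pycma, or a hand-rolled strategy) can then be pointed at it. A stdlib sketch of the 80-dimensional problem:

```python
import math

def rastrigin(x):
    # Highly multimodal benchmark; global minimum f(0) = 0 at the origin.
    return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi)
                             for xi in x)

# rastrigin([0.0] * 80) -> 0.0 (the global optimum for the 80-d benchmark)
```

Its many local minima are exactly what makes it a stress test for population-based methods, and why huge GPU-backed populations help.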
submitted by /u/NaturalGradient
[link] [comments]
( 45
min )
Could someone please help with this - https://ai.stackexchange.com/questions/38894/are-there-papers-that-do-an-empirical-investigation-on-drl-hyperparameters
submitted by /u/Academic-Rent7800
[link] [comments]
( 41
min )
This post is co-authored by Tristan Miller from Best Egg. Best Egg is a leading financial confidence platform that provides lending products and resources focused on helping people feel more confident as they manage their everyday finances. Since March 2014, Best Egg has delivered $22 billion in consumer personal loans with strong credit performance, welcomed […]
( 8
min )
GeForce NOW RTX 4080 SuperPODs are rolling out now, bringing RTX 4080-class performance and features to Ultimate members — including support for NVIDIA Ada Lovelace GPU architecture technologies like NVIDIA DLSS 3. This GFN Thursday brings updates to some of GeForce NOW’s hottest games that take advantage of these amazing technologies, all from the cloud. Read article >
( 6
min )
In the context of digital transformation and innovation, there is no lack of “hot topics” to discuss. Emerging technologies are truly emerging everywhere. What is most exciting – and what demonstrates their greatest promise – is that these new technologies are converging to produce innovative new businesses, products, and services. Over the past decade, we… Read More »Innovation at the Convergence of Emerging Technologies: Business at the Edge
The post Innovation at the Convergence of Emerging Technologies: Business at the Edge appeared first on Data Science Central.
( 22
min )
In a recent article on Autonomous Intelligent Systems (AIS) [1], Ajit Joakar described various features and characteristics of such systems, including associated technologies and research areas, building blocks and core elements, critical factors for success, and cross-cutting enablers. He introduces AIS as an “emerging interdisciplinary field that deals with situations where humans interact with AI systems… Read More »Five Principles of Safe Driving in AIS (Autonomous Intelligent Systems)
The post Five Principles of Safe Driving in AIS (Autonomous Intelligent Systems) appeared first on Data Science Central.
( 23
min )
We stand at the threshold of a new era of precision medicine, where health and life sciences data hold the potential to dramatically propel and expand our understanding and treatment of human disease. One of the tools that we believe will help to enable precision medicine is Terra, the secure biomedical research platform co-developed by […]
The post Biomedical Research Platform Terra Now Available on Microsoft Azure appeared first on Microsoft Research.
( 9
min )
Today, gaining customer loyalty cannot be a one-off thing. A brand needs a focused and integrated plan to retain its best customers—put simply, it needs a customer loyalty program. Earn and burn programs are one of the main paradigms. A typical earn and burn program rewards customers after a certain number of visits or spend. […]
( 7
min )
Model explainability refers to the process of relating the prediction of a machine learning (ML) model to the input feature values of an instance in humanly understandable terms. This field is often referred to as explainable artificial intelligence (XAI). Amazon SageMaker Clarify is a feature of Amazon SageMaker that enables data scientists and ML engineers […]
( 10
min )
In November 2022, we announced that AWS customers can generate images from text with Stable Diffusion models in Amazon SageMaker JumpStart. Today, we announce a new feature that lets you upscale images (resize images without losing quality) with Stable Diffusion models in JumpStart. An image that is low resolution, blurry, and pixelated can be converted […]
( 10
min )
As its name suggests, Orbital Sidekick is creating technology that acts as a buddy in outer space, keeping an eye on the globe using satellites to help keep it safe and sustainable. The San Francisco-based startup, a member of the NVIDIA Inception program, enables commercial and government users to optimize sustainable operations and security with Read article >
( 6
min )
Sponsored Post Attend the Data Science Symposium 2022 on November 8 The Center for Business Analytics at the University of Cincinnati will present its annual Data Science Symposium 2022 on November 8. This all day in-person event will have three featured speakers and two tech talk tracks with four concurrent presentations in each track. The […]
The post Attend the Data Science Symposium 2022, November 8 in Cincinnati appeared first on Machine Learning Mastery.
( 10
min )